ABSTRACT

Linked-Cluster Architectures have been suggested in the literature for organizing the
radios of a stationless mobile Packet Radio Network (PRN). Existing algorithms for achieving
such architectures do not attempt to minimize the number of clusters and gateway
nodes, aims which we claim are essential to the implementation of any Multiple Access
scheme. The problem is formulated on graphs in 3 different ways, all of which are
NP-complete. It is also shown that ε-Polynomial Time Algorithms are not likely to exist. A
simple centralized heuristic, Greedy, is analyzed in terms of its worst-case fractional error.
It is then argued that no efficient heuristic is likely to be significantly better in terms of
worst-case performance. Two Distributed Linked-Cluster Algorithms are presented. The
first, GLCA, minimizes the number of clusters identically to Greedy. The communication
complexity, C, of this algorithm, ignoring collisions, is $O(N^2 - N\sqrt{2M+1})$ for sparse graphs
and $O(MN - M\sqrt{2M+1})$ for dense graphs with N nodes and M edges. The time complexity,
T, is $O(N - \sqrt{2M+1})$ for both dense and sparse graphs. The second, the Tree Linked
Cluster Algorithm (TLCA), runs in $C = O(M)$ and $T = O(N)$ (also ignoring collisions).
The number of clusters is not minimized as well as in GLCA, but communication costs are
significantly lower. Schemes for collision minimization and resolution are suggested for both
algorithms.
CHAPTER I
INTRODUCTION 


A Packet Radio Network (PRN) is a communications network in which a set of
geographically distributed, possibly mobile computers communicate over a shared broadcast
medium (called the access channel). The network is packet switched to facilitate the efficient
management of bursty user traffic. The concept of bypassing telephone lines in favor of space
as the transmitting medium for packet networks originated at the University of Hawaii in the
early 1970’s. Their so-called ALOHA system consisted of a single omnidirectional antenna
which served as a hub, or station, for all terminal communication. The radio channel was
divided into two: a multiaccess channel, and a broadcast channel which was used by the
central computer to communicate with the terminals.

The  success  of  the  ALOHA  network  stimulated  extensive  research  in  the  area,  and 
the  model  was  extended  to  suit  a  variety  of  tactical  and  civilian  situations,  some  of  which 
are listed below:

(1) Networking mobile radios in the battlefield [1].

(2)  Communication  in  a  Naval  Battlegroup  [5]. 

(3)  Commercial  applications  in  highly  uneven  terrain  [1]. 

(4)  Cellular  Mobile  Telephones  [12]. 

Notice that these applications require that the users be possibly mobile, which introduces
the problem of designing protocols for networks whose topology is subject to change.
Another issue which is immediately raised is the effective utilization of the access channel
when not all the terminals are within line of sight of each other. Suggested future applications
for PRN’s bank critically on the assumption that these problems have been, or
can be, solved. For example, Nelson has noted the suitability of PRN’s to establish communications
following a natural disaster such as an earthquake, because of their “flexible
topologies” [3]; Kahn has referred to “personal terminals” which could be carried about by
individuals, allowing them to communicate, via a PRN, with other mobile users and with
a variety of computer resources [2]; and Licklider [4] has considered a rather inventive application
in which the physical condition of (mobile) elderly persons, equipped with various
sensor devices, is continuously monitored by a home computer through a PRN.

In some of these applications, such as Cellular Mobile Telephones, it is realistic to
assume that the network can be controlled by stations, which serve as fixed reference points
in the otherwise mobile network. Much progress has been made on this front, as is clear
from the rapidly growing use of Mobile Telephones. However, in many military scenarios,
such as networking radios in a battlefield, one cannot assume the feasibility of building
stations. This, coupled with the fact that mutual line of sight among all the radios cannot
be assured, complicates the problem of designing efficient protocols for network organization
and management, and presents to the PRN designer a challenge she has not as yet met
successfully.

The next few sections serve to justify the previous statement, and to acquaint the
reader briefly with some of the design issues of Mobile PRN’s. These sections will also
allow us to present the specific problem the rest of this thesis will address, and to present the
model it will use to do this. For good surveys of PRN’s see [1] and [2].


1.1. Technical System Considerations


Three  technical  constraints  that  apply  to  the  design  of  any  PRN  are  outlined.  Since 
these  constraints  are  technology  dependent  ones,  we  emphasise  their  importance  here,  but 
do  not  dwell  on  them  at  any  length  in  subsequent  sections. 

(a) Propagation Loss: It has been found that the propagation loss of ground radios is highly
sensitive to the type of terrain the PRN is located in. This makes the estimation of link
connectivity very hard when undulations in the terrain are not easy to predict, and strengthens
the case for finding automated network management procedures capable of sensing the
connectivity of the PRN in real time.

(b) Multipath Effects: These are reflections of the original signal which lead to superposition
at the receiver of several copies of the signal. This causes symbol interference and
fading, and consequently drives up the error rate. Movement by a mobile user of only a
few meters can cause the received signal strength to drop below the threshold, effectively
disabling the link. It then becomes possible for a link to be intermittently disabled and
enabled even if the user is moving about within a radius of only a few meters. These effects
are minimized by using spread-spectrum signalling, which is a combination of bandwidth
expansion and coding. It is assumed in this thesis that multipath effects are negligible.

(c) Capture Effects: A packet is said to be captured at the receiver when all successive
interfering packets at that receiver are rejected as noise. Unfortunately, it is possible that a
later arriving signal with a considerably greater strength might destroy a captured packet.
It is not clear how to model this situation exactly (e.g. beyond what threshold is a captured
packet actually destroyed?), and so we make the worst-case assumption that all colliding
packets at the receiver are destroyed.

For example, consider the situation in Fig. 1.1. Every radio has a fixed transmission
radius and all radios within this radius will hear it. Let A be transmitting to D. Clearly
B can hear the broadcast. Now suppose that C would like to transmit to B. Then some
packets from C are sure to collide with those from A at the node B.


Fig. 1.1. Capture Assumption: $r_A$ = transmission radius of A; $r_C$ = transmission radius of C.

We  assume  that  all  packets  from  C  and  A  transmitted  at  overlapping  time  intervals 
will  be  destroyed  at  node  B. 
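
This worst-case rule is easy to state operationally. The sketch below is illustrative only: the function name `surviving_packets`, the extra sender "E", and the representation of each arrival as a `(sender, start, end)` tuple are assumptions made for the example, not part of the model in this thesis.

```python
def surviving_packets(arrivals):
    """Return the senders whose packets reach one receiver intact.

    arrivals: list of (sender, start, end) tuples heard at a single receiver.
    Under the worst-case assumption, any two packets whose time intervals
    overlap are both destroyed, regardless of signal strength.
    """
    survivors = []
    for k, (sender, start, end) in enumerate(arrivals):
        collided = any(
            start < o_end and o_start < end
            for j, (_, o_start, o_end) in enumerate(arrivals)
            if j != k
        )
        if not collided:
            survivors.append(sender)
    return survivors


# Node B hears A and C at overlapping times, then a hypothetical sender E alone.
heard_at_b = [("A", 0.0, 2.0), ("C", 1.0, 3.0), ("E", 5.0, 6.0)]
print(surviving_packets(heard_at_b))  # ['E']
```

No capture threshold appears anywhere in the sketch: relative signal strength is deliberately ignored, which is what makes the assumption a worst case.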

It is worth observing here that in fixed topology PRN’s, the transmission radii can
be fixed to an optimum so that the effects of collisions are minimised, and throughput is
maximised. However, in a mobile environment the network topology may be varying quite
rapidly, and so this approach will not work. We therefore assume the transmission radii to
be arbitrary and fixed.


1.2. Multiaccess Methods and the Hidden Terminal Problem


The throughput of a broadcast network depends critically on the method its users
employ to access the channel. It is clear that if all users were to broadcast at random the
large number of collisions would result in a poor level of throughput. Many schemes have been
suggested for broadcast networks in general. Carrier sensing methods have been found to be
particularly attractive. By carrier sensing we refer to protocols in which the terminal listens
to the channel to check if another terminal is broadcasting, and only transmits a packet
when the channel is sensed to be idle. An important assumption made in the analysis of
these schemes is that the terminals are within mutual line of sight (los), and within range of
each other. Unfortunately, this does not apply to many PRN’s. In fact it is possible for two
terminals to be within los but not within range of each other, or for them to be obstructed
by an object which blocks radio waves, such as a hill or a tall building. Such terminals
are said to be hidden from each other. It has been shown that hidden terminals seriously
degrade the throughput performance of traditional carrier sensing methods [6]. Intuitively
this effect can be explained by observing that when a terminal senses the channel, it has
no way to determine if a hidden terminal is broadcasting at the time.
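
As a small illustration of the range-limited case, the sketch below lists the pairs of terminals that can both reach a given receiver yet cannot sense each other. It is a sketch under stated assumptions: terminals are points in a plane, a terminal is heard exactly when the listener lies within the talker's transmission radius, and obstruction by terrain is ignored. The names `can_hear` and `hidden_pairs` are hypothetical.

```python
import math
from itertools import combinations


def can_hear(listener, talker, pos, radius):
    """True if `listener` lies within the transmission radius of `talker`."""
    return math.dist(pos[listener], pos[talker]) <= radius[talker]


def hidden_pairs(receiver, pos, radius):
    """Pairs of terminals that both reach `receiver` but cannot sense each other."""
    senders = [n for n in pos
               if n != receiver and can_hear(receiver, n, pos, radius)]
    return [(a, b) for a, b in combinations(senders, 2)
            if not (can_hear(a, b, pos, radius) and can_hear(b, a, pos, radius))]


# Three terminals on a line, 1 unit apart, each with radius 1.2.
pos = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (2.0, 0.0)}
radius = {"A": 1.2, "B": 1.2, "C": 1.2}
print(hidden_pairs("B", pos, radius))  # [('A', 'C')]
```

Here A and C each sense an idle channel while the other is transmitting, so their packets can collide at B even though both follow the carrier sensing rule.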

Tobagi and Kleinrock [6] have suggested a Busy Tone solution to this problem
by proposing the location of a central station such that all terminals are within los and
mutual range of it. The station transmits a busy tone on a specially set up busy-tone
channel to all terminals, as long as it senses a transmission on the multiaccess channel. The
terminals transmit only when they do not sense the busy tone. It has been shown that this
solution is able to compensate substantially for hidden terminal effects. A fixed station is
also useful in cases when non-random access to the channel is desired, such as polling or
centralized reservations, which suggests that whenever such controllers may be constructed
economically, they should be. However, as noted earlier, many important applications of
PRN’s are in necessarily stationless environments, and it is not clear how to resolve the
Hidden Terminal problem in such situations.

1.3. Linked Cluster Architectures

The ideas in the previous section provide the motivation for this scheme, which was
first proposed by Baker and Ephremides [5]. It is reasonable to try to enhance the conflict
resolution capability of a stationless PRN by organizing it into clusters, each of which has a
node designated as the leader or head of that cluster. This node is bidirectionally connected
to every other node in the cluster, and acts as a local station or controller.* Each node
is in some cluster, and therefore is controlled by some cluster head. The hidden terminal
problem is solved within a cluster using Busy-Tone Multiple Access or some non-random
scheme such as polling.

Observe that in general it is necessary to have more than one cluster in a network,
since the transmission radii of the radios are fixed, and it is possible that no radio is
connected to all the other nodes. This brings up the problem of inter-cluster communication,
which is handled by linking adjacent cluster leaders through other nodes called gateway
nodes. A gateway node is necessary if two adjacent cluster leaders are not within range
of each other. Thus the organization of the PRN consists of possibly overlapping clusters,
whose leaders form a backbone network, and are linked by gateway nodes. It should be
understood that since the nodes are mobile, any node may become a cluster leader or a
gateway node, and that the status of a node will, in general, keep changing. An algorithm
which organizes the nodes of a PRN in this manner must not depend on the existence of a
particular node in a particular region, thus forcing it to be distributed in nature.

* If node i can hear node j then we say that j is accessible from i. From now on two
nodes are defined to be connected iff they are accessible from each other.


1.4. The Case For Minimizing The Number of Clusters


The major difference between the solutions proposed for the Hidden Terminal problem
in sections 1.2 and 1.3 is that there are many controllers in the stationless environment.
This leads to complications which arise from the existence of gateway nodes, and the possibility
of one node being contained in several clusters.


Fig. 1.2 (a), (b) and (c)


These difficulties are explained with reference to the network represented by the
graph in Fig. 1.2(a). The nodes represent radios, and the links, bidirectional connections.
Notice that transmissions from nodes $1 \ldots m$ conflict with each other, as do those from
nodes $m+1 \ldots 2m$. Nodes $m$ and $m+1$ also form a conflicting pair. Any Linked-Cluster
organization of this network must have at least 2 clusters since no single node is connected to
all the other nodes. Figures 1.2(b) and (c) represent 2 possible Linked-Cluster organizations
for this network.


First let’s focus on Fig. 1.2(b). There are $2m - 2$ clusters; the $i$-th cluster is led by
$L_i$. Notice that the $i$-th cluster contains 2 nodes: $L_i$ and either $G_1$ or $G_2$. The G nodes are
each physically contained in m clusters and also serve as the gateway nodes of the network.
Now let us examine the question of which cluster leader(s) should control the gateway node
$G_1$. Suppose that only one of the cluster leaders, say $L_1$, controls the transmissions of $G_1$.
Then in a busy-tone solution, $G_1$ would listen for the busy tone of $L_1$ and would transmit
whenever the busy-tone channel is idle. But this could cause collisions in any of the other clusters
led by $L_2 \ldots L_{m-1}$, unless different clusters use different frequency bands. The allocation
of these frequency bands to the various clusters is not a straightforward problem, and has
barely been solved for applications with fixed stations [17]. Since the number of clusters in a
stationless environment is constantly changing, an upper bound on the maximum number of clusters
possible depends on, among other things, the number of nodes in the network. Since any 2
clusters may overlap, the number of bands the access channel is to be divided into depends
in turn on the number of clusters. Hence for large networks the number of frequencies
required may become ridiculously large. At any rate, there are a number of reasons why
the number of frequency bands the radio can operate at should be kept down to a bare
minimum, and so it is highly desirable to avoid the collisions in some other way. Notice
that similar collisions would occur even if a non-random access scheme such as polling or
reservations were used within each cluster. Thus the organization of Fig. 1.2(b) is not amenable
to any efficient solution to the hidden terminal problem. It is important to note that many
of these defects can be minimized using spread spectrum, but this introduces the problem of
knowing which pseudo-noise keys to listen for, and arguments similar to those made above
now come into play in resolving this problem.

On the other hand, if $G_1$ is controlled by all of the cluster leaders it is connected
to, then in a busy-tone solution it can only transmit when it does not sense traffic in any
of the clusters $L_1 \ldots L_{m-1}$. This is an unfair throttling of $G_1$, i.e. the throughput at node
$G_1$ would be very low compared to that of the nodes $L_1 \ldots L_{m-1}$.


Another severe problem in the Linked-Cluster scheme of Fig. 1.2(b) is that the gateway
nodes are common to many pairs of clusters. In fact, no inter-cluster communication
can occur without the message being routed through at least one of the gateway nodes.
This implies that numerous collisions can occur at $G_1$ and $G_2$, since inter-cluster messages
are always sent in an uncoordinated manner.

Observe how nicely these defects are corrected in the organization of the nodes in
Fig. 1.2(c). Here there are only 2 clusters, and no gateway nodes. The only nodes which are
physically contained in more than one cluster are the leaders themselves. This organization
is superior to the one in Fig. 1.2(b) for 3 important reasons:

(a) Clusters are larger in size (m versus 2). This decreases collisions since intra-cluster
collisions can be minimized using existing techniques.

(b) The protocols for inter-cluster communication are much simpler. This is because
the gateway nodes have been eliminated.

(c) The high number of collisions which occurred at gateway nodes has been eliminated
since there are no gateway nodes.

This example illustrates the point that a Linked-Cluster Algorithm must try to
organize the nodes so that there are few clusters, and as few gateway nodes as possible. In
this thesis we will examine the important tradeoffs of minimizing these quantities against
the communication and computational complexities of algorithms.
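
The contrast between the two organizations can be checked mechanically. The sketch below rests on an assumption about Fig. 1.2(a), which is not reproduced here: the network is read as two stars, one with hub m and leaves 1 to m-1, the other with hub m+1 and leaves m+2 to 2m, with the two hubs joined. The helper names `double_star` and `is_leader_set` are hypothetical.

```python
def double_star(m):
    """Adjacency sets for the assumed reading of Fig. 1.2(a)."""
    adj = {v: set() for v in range(1, 2 * m + 1)}

    def link(a, b):
        adj[a].add(b)
        adj[b].add(a)

    for leaf in range(1, m):                # leaves 1 .. m-1 hang off hub m
        link(leaf, m)
    for leaf in range(m + 2, 2 * m + 1):    # leaves m+2 .. 2m hang off hub m+1
        link(leaf, m + 1)
    link(m, m + 1)                          # the two hubs are connected
    return adj


def is_leader_set(adj, leaders):
    """Every node is a leader or bidirectionally connected to one."""
    return all(v in leaders or adj[v] & leaders for v in adj)


m = 5
adj = double_star(m)
leaves = set(adj) - {m, m + 1}
print(len(leaves), is_leader_set(adj, leaves))   # organization (b): 2m - 2 leaders
print(is_leader_set(adj, {m, m + 1}))            # organization (c): 2 leaders
print(is_leader_set(adj, {m}))                   # one leader is not enough
```

Under this reading both leader sets are valid, so the gap between 2m - 2 clusters and 2 clusters is entirely a matter of which valid organization the algorithm happens to choose.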


1.5  Previous  Work 


The only algorithm in the literature which organizes mobile radios into a Linked-Cluster
Architecture is due to Baker and Ephremides [5]. It was developed specifically for the
Navy’s HF Intra Task Force (ITF), which is a general purpose PRN providing extended line
of sight communications at sea. While this distributed algorithm is easy to implement,
it has a number of drawbacks which restrict its utility to a very small range of applications.
A major aim of this thesis is to suggest ways to overcome these rather serious shortcomings,
and so the algorithm (which we will call ITF for lack of a better name) is now presented
and analyzed in some detail.

ITF assumes that the nodes are numbered from 1 to N, where N never changes. The
access channel is controlled by Time Division Multiplexing the users. Time is divided into
epochs, and each epoch into 2 frames. A frame consists of N slots, and node $i$ broadcasts
in the $i$-th slot. The details of what exactly is broadcast in the slots of these frames are
not important and can be found in [5]. Suffice it to say that by the end of the second
frame each node has the following topological information: it knows the nodes to which it is
connected, and the nodes to which these nodes are connected. Based on this information a
node declares itself to be either a cluster leader, a gateway node, or just an ordinary node.
If it is a gateway it knows which clusters it links, and if it is a cluster leader, it knows all
its gateway nodes.


The actual algorithm used to form the clusters is a distributed version of the following
simple centralized procedure: Declare N to be a cluster leader. Then if N - 1 is
bidirectionally connected to any node, including itself, that N is not connected to, then declare
N - 1 to be a cluster leader. Similarly check if N - 2 is connected to any node which
is not connected to N or N - 1. If so, declare N - 2 a leader. The procedure continues
like this until all nodes are connected to the set of declared cluster leaders. An example is
shown in Fig. 1.3.


Fig. 1.3. Leaders are selected in the order 11, 10, 7, 6, 5.
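
The centralized procedure can be sketched as follows. This is a sketch of the procedure as described in the text, not of the distributed protocol in [5]; the function name `itf_leaders` and the adjacency-set representation are assumptions, and the two small test networks are made up for illustration rather than taken from Fig. 1.3.

```python
def itf_leaders(adj):
    """Pick cluster leaders by scanning node ids from highest to lowest.

    adj: dict mapping each node id to the set of nodes it is
    bidirectionally connected to.
    """
    covered = set()      # nodes that are leaders or connected to a leader
    leaders = []
    for node in sorted(adj, reverse=True):
        closed = adj[node] | {node}
        # A node becomes a leader if it reaches some node (possibly itself)
        # that no previously declared leader reaches.
        if not closed <= covered:
            leaders.append(node)
            covered |= closed
        if len(covered) == len(adj):
            break
    return leaders


# Hypothetical path 1-2-3-4: two leaders would suffice, ITF declares three.
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
print(itf_leaders(path))   # [4, 3, 2]

# Hypothetical star with hub 1: one leader would suffice, ITF declares four.
star = {1: {2, 3, 4, 5}, 2: {1}, 3: {1}, 4: {1}, 5: {1}}
print(itf_leaders(star))   # [5, 4, 3, 2]
```

The outcome depends entirely on how the identity numbers happen to be laid out over the topology, which is why an unlucky numbering can produce nearly as many leaders as nodes.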

Unfortunately, there are a number of difficulties with the algorithm, some of which
were recognized and discussed by the authors themselves. We list these previously recognized
problems below:

(i) Clusters may be contained within other clusters.

(ii) Several pairs of nodes may declare themselves to be gateway nodes between the
same pair of clusters.

(iii) Asymmetric gateway links are possible, in which one of the nodes in a gateway pair
is not aware that the other is.

In addition, there remain three major problems which have not been mentioned.
The first has to do with the fact that the number of users must be fixed. In any PRN
nodes are bound to fail, and other nodes may need to be added to the network. The
algorithm does not allow for such additions and deletions.

The second problem is that of inefficient use of time. The channel is time division
multiplexed, so that only one user can broadcast at a time. However, to avoid collisions
completely it is sufficient to ensure that users less than three hops apart do not broadcast
at the same time. When the network is large and sparsely connected it might be
beneficial to risk some collisions, and to then resolve them using a contention based scheme,
rather than to TDMA the channel.
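
To see how much time a network-wide TDMA frame can waste, the sketch below assigns slots so that only users less than three hops apart are kept in different slots. The greedy assignment used here is a technique swapped in for illustration; it is not proposed in the thesis, and the names `two_hop_neighbours` and `assign_slots` are hypothetical.

```python
def two_hop_neighbours(adj, v):
    """All nodes one or two hops away from v (v itself excluded)."""
    near = set(adj[v])
    for u in adj[v]:
        near |= adj[u]
    near.discard(v)
    return near


def assign_slots(adj):
    """Greedily give each node the lowest slot unused within two hops."""
    slot = {}
    for v in sorted(adj):
        taken = {slot[u] for u in two_hop_neighbours(adj, v) if u in slot}
        s = 0
        while s in taken:
            s += 1
        slot[v] = s
    return slot


# Hypothetical sparse network: a path of six nodes.
path6 = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4, 6}, 6: {5}}
slots = assign_slots(path6)
print(len(set(slots.values())), "slots instead of", len(path6))
```

On this path three slots suffice where ITF would use six, and the gap widens as a sparse network grows.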

The third, and most important problem, from the point of view of this thesis, is
that the algorithm does not take into account the fact that it is essential to try to restrict
the number of clusters and gateway nodes. Fig. 1.4 presents the results of cluster organization
using ITF. The reader may verify these results by applying the centralized
algorithm illustrated in Fig. 1.3. It is seen that ITF sometimes performs miserably, and
is then no better than the case in which every node in the network is a cluster leader and
broadcasts messages at random!

From these remarks one may conclude that the Hidden Terminal problem has not
been solved for mobile, stationless PRN’s, and that much work needs to be done in developing
distributed Linked-Cluster algorithms which try to minimize the number of clusters
and gateway nodes in the PRN.

1.6  The  PRN  Model 

We now present a model which is reasonable in view of the facts and arguments
presented in earlier sections. It is assumed that a PRN consists of N identical mobile radios,
distributed in a plane. Radios will be interchangeably referred to as nodes and terminals. Each
radio has an integer identity number and consists of a receiver with an omnidirectional
antenna, and a transmitter with a fixed transmission radius. All the radios need not have
the same radius of transmission. Messages are sent singly by a terminal in framed packets
of equal length. The packets contain header information, such as the origin and destination,
necessary for multihop routing. If several packets are received at a node simultaneously,
all of them are assumed to be destroyed, and must be retransmitted. Lower level protocols
such as acknowledgement schemes (ARQ’s), which inform the sender of a lost packet, are
not assumed to be present. Finally, the PRN is always connected; there is a path between
all origin-destination pairs at all times.

1.7.  Outline  of  the  Thesis 


There are two main parts to this thesis. The first deals with the tradeoff of minimizing
the number of clusters and gateway nodes against the computational complexity of
the problem. The problem is formulated in graph-theoretic terms, and analyzed from a
centralized standpoint (i.e. global information about the topology is assumed). The resulting
optimization problems are found to be NP-complete, forcing us to look for efficient
(polynomial time) heuristics. Considerable attention is given to measuring the worst-case
performance of such heuristics, and it is shown that it is extremely unlikely that an efficient
algorithm exists which guarantees a solution that comes within even a constant factor of the optimum.
Then some heuristics are presented and analyzed which are computationally efficient, and
which do a “reasonable” job in approximating the optimum answer.

The second part (Chapter 3) consists of distributed versions of the centralized
heuristics analyzed in Chapter 2. At this point special emphasis is placed on the communication
complexity of the algorithms, and on actual implementation in PRN’s.


Finally,  Chapter  4  is  a  brief  conclusion,  and  contains  suggestions  for  further  work. 


CHAPTER II

COMPUTATIONAL  COMPLEXITY  OF  THE  CENTRALIZED  PROBLEM 


Any Linked-Cluster algorithm must be able to organize the terminals of a PRN
efficiently, since the nodes are mobile and the topology of the network may be changing quite
rapidly. It has already been argued that the number of clusters and gateway nodes should
be minimized. However, the computational implications of such an approach have not been
discussed. In this chapter we will address the degree to which it is possible
to minimize these quantities without lapsing into the realm of inefficient (exponential time)
algorithms.

In the centralized problem, all the nodes are omniscient and thus have complete
information about the topology of the network. Also, computations are performed sequentially,
not in parallel. It is easy to see that if the best algorithm which solves the centralized
problem runs in time $T$, then no distributed algorithm can solve the problem in less than
time $T/N$, where $N$ is the number of nodes in the network. A direct implication of this is that
if no polynomial time centralized algorithm exists, there can be no distributed algorithm
which runs in polynomial time either. Hence it is profitable to first study the problem from
a centralized standpoint, since negative results apply to the distributed case as well.

2.1. Formulating the Problem


Recall that a cluster leader must be bidirectionally connected to all of the nodes
in its cluster. Hence, we will be concerned only with such bidirectional adjacencies in the
network. Given a network topology at some time instant, define a graph $G(V, E)$ as follows:

$V$ = the set of users $\{1, 2, \ldots, N\}$

$E = \{(i,j) : i \text{ can hear } j \text{ and } j \text{ can hear } i\}$.
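
A minimal sketch of this construction follows. It assumes the hearing relation is given as a dictionary `hears` mapping each user to the set of users it can hear; that representation and the name `bidirectional_edges` are illustrative choices, not part of the thesis.

```python
def bidirectional_edges(hears):
    """Build E from a (possibly asymmetric) hearing relation.

    hears[i] is the set of users that user i can hear. An edge {i, j}
    is kept only when i can hear j and j can hear i.
    """
    return {
        frozenset((i, j))
        for i in hears
        for j in hears[i]
        if i != j and i in hears.get(j, ())
    }


# User 1 hears 2 and 3, but only user 2 hears 1 back.
hears = {1: {2, 3}, 2: {1}, 3: set()}
print(bidirectional_edges(hears))  # {frozenset({1, 2})}
```

One-way links, which arise when transmission radii differ, are simply dropped, so G is an undirected graph.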

Observe that a complete set of cluster leaders is a set of nodes such that every node not in
the set is adjacent to at least one node in the set. This is known as a Dominating Set in
Graph Theory, and is formally defined as follows:

Definition 2.1. A Dominating Set, $D$, of a graph $G(V, E)$ is a set of nodes such that
$\forall j \in (V - D),\ \exists i \in D$ s.t. $(i,j) \in E$.

If the number of clusters is to be minimized without bothering about the number of
gateway nodes, one could look for the minimum cardinality dominating set. If no gateway
nodes are desired, then the dominating set should be one of minimum cardinality such that
the graph it induces is connected. A compromise between these two approaches is to find a
dominating set of minimum cardinality such that for every pair of nodes in the set there is
a path which connects them in which 2 gateway nodes are never in successive order.
These three formulations of the problem are now given formally:

Definition 2.2. The Minimum Dominating Set problem (DSP) is the one of finding a
minimum cardinality Dominating Set, $D^*$, for a graph $G$.

Definition 2.3. The Connected Dominating Set problem (CDSP) is the one of finding a
minimum cardinality Dominating Set, $D^*$, for a graph $G$, such that the graph induced by
$D^*$ is connected.

Definition 2.4. The Semi-Connected Dominating Set problem (SCDSP) is the one of
finding a minimum cardinality Dominating Set, $D^*$, for a graph $G$, such that there is a
spanning tree of $G$ whose edges each have the property that at least one endpoint is in $D^*$.


Fig. 2.1 illustrates the different formulations.
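
The three formulations can also be compared on a tiny instance by exhaustive search. The sketch below is for illustration only: it enumerates subsets and is therefore exponential, the six-node path is a made-up example rather than the graph of Fig. 2.1, and all function names are hypothetical.

```python
from itertools import combinations


def dominates(adj, D):
    """Every node is in D or adjacent to a node of D."""
    return all(v in D or adj[v] & D for v in adj)


def connected(nodes, edges):
    """True if `edges` (restricted to `nodes`) join `nodes` into one component."""
    nodes = set(nodes)
    if not nodes:
        return True
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        v = stack.pop()
        for a, b in edges:
            for x, y in ((a, b), (b, a)):
                if x == v and y in nodes and y not in seen:
                    seen.add(y)
                    stack.append(y)
    return seen == nodes


def smallest(adj, extra):
    """Brute-force a minimum dominating set that also satisfies `extra`."""
    edges = [(a, b) for a in adj for b in adj[a] if a < b]
    for k in range(1, len(adj) + 1):
        for cand in combinations(sorted(adj), k):
            D = set(cand)
            if dominates(adj, D) and extra(adj, edges, D):
                return D
    return None


def dsp(adj, edges, D):
    return True                      # no extra condition


def cdsp(adj, edges, D):
    # The subgraph induced by D must be connected.
    return connected(D, [(a, b) for a, b in edges if a in D and b in D])


def scdsp(adj, edges, D):
    # Edges touching D must span G, so a spanning tree can be built from them.
    return connected(adj, [(a, b) for a, b in edges if a in D or b in D])


path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4, 6}, 6: {5}}
sizes = [len(smallest(path, f)) for f in (dsp, cdsp, scdsp)]
print(sizes)  # [2, 4, 3]
```

On this path the optimum grows from 2 leaders (DSP) to 3 (SCDSP) to 4 (CDSP), which shows the price paid in clusters for reducing or eliminating gateway nodes.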

2.2. Some Definitions

In the following sections we will prove a number of results by polynomial many-one
reductions of problems widely believed to be intractable to the problems defined above.
These intractable problems are defined here:

Definition 2.5. The Vertex Covering Problem (VC) is the one of finding the minimum
cardinality subset of nodes, $C$, for a graph $G(V, E)$ s.t. $\forall (i, j) \in E$, ($i \in C$ or $j \in C$).

Definition 2.6. The Set Covering Problem (SC) is defined on a ground set, $T$, and a collection,
$S$, of subsets of $T$, and is the one of finding a minimum cardinality subset $P$ of $S$ such that
$\bigcup_{S_i \in P} S_i = T$.


Definition 2.7. The Directed Dominating Set Problem (DDSP) is the one of finding a
minimum cardinality dominating set in a directed graph $G(V, A)$.

Definition 2.8. The Maximum Leaf Spanning Tree Problem (MLSP) is the one of finding
a spanning tree of a graph $G(V, E)$ which contains a minimum number of non-leaf nodes.

We  will  also  need  the  concepts  of  closed  and  open  neighborhoods  of  a  node: 

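$N(i)$, $i \in V$: the set of nodes adjacent to $i$ (open neighborhood).

$\bar{N}(i)$, $i \in V$: $N(i) \cup \{i\}$ (closed neighborhood).

$\bar{N}(S) = \bigcup_{i \in S} \bar{N}(i)$.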

2.3  NP-Completeness  Results 

Readers unfamiliar with proofs of NP-completeness are referred to [13] for a standard
treatment of the subject. $A \propto B$ means that any instance of problem $A$ can be converted
to an "equivalent" instance of $B$ in time polynomial in the size of the instance of problem
$A$.

Lemma  2.1.  DSP,  CDSP,  and  SCDSP  are  NP-complete. 

Proof:  We  prove  the  lemma  in  3  parts: 

(a) DSP: We show VC $\propto$ DSP. Given $G(V, E)$ define $G'(V', E')$ such that
$V' = V \cup \{v_{ij} : (i, j) \in E\}$ and $E' = E \cup \{(i, v_{ij}), (j, v_{ij}) : (i, j) \in E\}$ (see fig. 2.2).

Suppose that $P$ is a vertex cover in $G$. Then $P$ is a dominating set in $G'$ (assuming that
$G$ has no isolated vertices) since $P$ must be a dominating set in $G$, and must also cover all
nodes of the form $v_{ij}$ in $G'$.

If $D$ is a dominating set in $G'$, then so is the set

$\hat{D} = \{v : v \in (V \cap D)\} \cup \{i : v_{ij} \in D\}$. Now observe that since $\hat{D}$ covers all nodes of
the form $v_{ij}$ in $G'$, it must also cover all edges $(i, j)$ in $G$, thus implying that it is a vertex
cover of $G$.


We can then conclude that $G$ has a VC of size $k$ $\iff$ $G'$ has a DS of size $k$. Done.

(b) SCDSP: VC $\propto$ SCDSP. This follows immediately from the reduction in (a). Observe
that the dominating set in $G'$ must be semi-connected, else at least one of the $v_{ij}$'s would
not be covered.
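The reduction in part (a) can be exercised on a small instance. The sketch below is illustrative only and is not taken from the original text: it assumes that $G'$ keeps the edges of $G$ and joins each new node $v_{ij}$ to both $i$ and $j$, and it uses brute-force search (exponential, so suitable for tiny graphs only) to compare the two optima. All names are hypothetical.

```python
from itertools import combinations

def vc_to_ds_instance(nodes, edges):
    """Build G' from G: keep E and add a node v_ij adjacent to i and j per edge."""
    adj = {v: set() for v in nodes}
    for i, j in edges:
        w = ("v", i, j)                      # the new node v_ij
        adj[w] = set()
        for a, b in ((i, j), (i, w), (j, w)):
            adj[a].add(b)
            adj[b].add(a)
    return adj

def min_vertex_cover_size(nodes, edges):
    """Smallest k such that some k nodes touch every edge (brute force)."""
    for k in range(len(nodes) + 1):
        for C in combinations(nodes, k):
            chosen = set(C)
            if all(i in chosen or j in chosen for i, j in edges):
                return k

def min_dominating_set_size(adj):
    """Smallest k such that some k nodes dominate the graph (brute force)."""
    nodes = list(adj)
    for k in range(len(nodes) + 1):
        for D in combinations(nodes, k):
            chosen = set(D)
            if all(v in chosen or adj[v] & chosen for v in adj):
                return k

# A 5-cycle has no isolated vertices, as the reduction requires.
cycle_nodes = list(range(5))
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
```

For the 5-cycle, the minimum vertex cover of $G$ and the minimum dominating set of $G'$ both have size 3, as the reduction predicts.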

(c) CDSP: DSP $\propto$ CDSP. Given $G(V, E)$ define 2 graphs $G^1(V^1, E^1)$ and $G^2(V^2, E^2)$
such that: $G^1$ has $|V|$ nodes, $\{c_1, \ldots, c_{|V|}\}$, and forms a complete graph; and $G^2$ has nodes
$\{d_1, \ldots, d_{|V|}\}$, and no edges. Then define the graph $G'(V', E')$ such that $V' = V^1 \cup V^2$,
$E' = E^1 \cup E^2 \cup \{(c_i, d_j) : c_i \in V^1, d_j \in V^2, (i = j) \text{ or } (i, j) \in E\}$.

See  fig.  2.3.  for  an  illustrative  example. 

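Suppose $G$ has a DS, $D$. Then the set $\{c_i : i \in D\}$ is a connected DS in $G'$.
On the other hand, suppose $G'$ has a CDS, $D' = C \cup D$ s.t. $C \subseteq V^1$, $D \subseteq V^2$. Then surely
$S = C \cup \{c_i : d_i \in D\}$ is also a CDS, since the neighborhood of $d_i$ is contained in the closed
neighborhood of $c_i$ for all $i$. Then it is clear that $\{i : c_i \in S\}$ is a
DS in $G$. Done.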
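The construction of part (c) can also be tried out on a small graph. This is a sketch under the assumptions stated in the proof (a clique on the $c_i$, no edges among the $d_j$, and $c_i$ joined to $d_j$ whenever $i = j$ or $(i, j) \in E$); the tuple labels `("c", i)` and `("d", i)` and the helper names are illustrative choices, and the search is brute force.

```python
from itertools import combinations

def dsp_to_cdsp_instance(adj):
    """Build G' from G: clique on c-nodes, c_i - d_j iff i == j or (i, j) in E."""
    nodes = list(adj)
    new = {("c", i): set() for i in nodes}
    new.update({("d", i): set() for i in nodes})
    for i in nodes:
        for k in nodes:
            if k != i:
                new[("c", i)].add(("c", k))      # G^1 is complete
        for j in adj[i] | {i}:
            new[("c", i)].add(("d", j))
            new[("d", j)].add(("c", i))
    return new

def is_ds(adj, D):
    return all(v in D or adj[v] & D for v in adj)

def is_cds(adj, D):
    """Dominating, and the subgraph induced by D is connected (DFS inside D)."""
    if not D or not is_ds(adj, D):
        return False
    start = next(iter(D))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for w in adj[u] & D:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen == D

def min_size(adj, feasible):
    """Size of the smallest node subset accepted by `feasible` (brute force)."""
    for k in range(1, len(adj) + 1):
        for D in combinations(list(adj), k):
            if feasible(adj, set(D)):
                return k

path4 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
```

For the 4-node path, the minimum DS of $G$ and the minimum CDS of $G'$ both have size 2.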


All the above reductions can be carried out in time polynomial in $|V|$ and $|E|$, i.e.
they are polynomial time reductions. To see that the problems are in NP observe that


2.4.  Measures  of  Performance  of  Heuristics 

Since the problem of finding optimal dominating sets is NP-complete, we can safely
assume that there is no algorithm that solves it in polynomial time. It is clear that we are
forced to trade optimality for time, but exactly in what proportion is less obvious. Several
methods of dealing with NP-complete problems are to be found in the literature, some more
appropriate than others to our problem.

Some algorithms run in polynomial time for most cases, but are inefficient in solving
a small number of "atypical" instances. The Simplex algorithm for Linear Programming
is such an algorithm.* The main reason why such an approach is not appropriate for
our problem is that there is a possibility of the algorithm taking an inordinate amount of
time in some cases. In a mobile environment, one of the most essential attributes of a network
organizing algorithm is speed.

Heuristics have also been analyzed with respect to their average case performance.
This becomes possible when there is reason to believe that the instances encountered in
practice obey some probability distribution. In this case good algorithms are those which
run in polynomial time and are guaranteed to give optimal, or near optimal, solutions for
"almost" all problem instances.

The notion of "almost all" has been quantified as follows: Let $S_N$ be a given
probability distribution for all instances of size $N$. Let $X(I)$ be a boolean variable for
some condition applicable to each of the problem instances. For example, $X(I)$ can be the
condition that a given algorithm solves the problem instance $I$ optimally. Let $q_N$ be the
probability that $X(I)$ does not hold for a randomly selected problem instance of size $N$.
Then $X(I)$ holds almost everywhere if $\sum_{N=1}^{\infty} q_N < \infty$.

* It is true that LP is not NP-complete. However, the behaviour of the Simplex
algorithm is consistent with the one we are describing.


Now let's examine the suitability of such an approach to our problem. Many
researchers have made probabilistic assumptions on the topology of mobile PRN's. For
example, Nelson has modeled a snapshot of the network as a Poisson point process with a mean
density of $\lambda$ terminals [3]. The radios have identical transmitting radii. Gallager [16] has
suggested choosing bidirectional links between terminals with a probability dependent on
their distance from each other. Hence, one might try an average case analysis of algorithms
based on a reasonable probability distribution for the topology of the network. However,
there are 2 problems which remain:

(1) The probabilistic analysis of algorithms is usually extremely cumbersome and
involved for the simplest of algorithms, as a consequence of which results have not
been forthcoming in this field [13]. There is no reason to suppose that this will not
hold for our problem as well.

(2)  Suppose  that  an  algorithm  is  found  to  be  optimal  almost  everywhere.  This  does  not 
mean  that  its  performance  is  guaranteed  to  be  within  any  reasonable  limits  for  an 
infinity of "atypical" instances. Suppose the PRN reaches such a configuration, and
the  radios  do  not  move  significant  distances  for  a  while  i.e.  the  atypical  configuration 
holds  for  a  long  time.  Then  very  poor  organization  of  the  network  would  persist 
for  this  period,  resulting  in  numerous  collisions,  and  considerable  inconvenience  to 
the  users. 

In this thesis efficient heuristics will be analyzed from a more conservative
standpoint. The performance is based on the worst-case situation, thus bounds on the running
time and error are guaranteed. The error may be measured as a constant difference
between the achieved and optimal solutions (differential error), or as a constant fraction of
the optimal solution (fractional error). The next Theorem shows that no efficient heuristic
can guarantee a constant differential error for our problem.

Theorem 2.1(a). Let $A$ be an efficient heuristic for DSP, $d_G$ be the cardinality of a
solution returned by $A$ on the graph $G$, and $K_G$ be the cardinality of an optimal solution
of DSP for $G$. Then unless P = NP, there is no integer $M$ such that

$$d_G - K_G \le M, \quad \forall G.$$


Proof: Suppose there is an efficient heuristic $B$ for which the theorem does not
hold. Then there must be an integer, $M'$, for which the inequality holds. Now consider the
graph, $H$, which is made up of $M' + 1$ copies of some arbitrary graph, $G$, and apply $B$ to $H$.
By assumption:

$$d_H - K_H \le M' \qquad (2.1)$$

But $K_H = (M' + 1)K_G$. Since the copies of $G$ are disconnected in $H$, all the nodes in any
particular copy are dominated exclusively by nodes in the same copy. Hence the nodes
selected by $B$ in any copy form a dominating set of $G$.

Now let the cardinality of the smallest complete set of nodes selected in any copy be $\gamma$. It
is clear that $d_H \ge \gamma(M' + 1)$. Substituting expressions for $d_H$ and $K_H$ in (2.1) we have

$$\gamma(M' + 1) - (M' + 1)K_G \le M'$$

which implies that

$$\gamma - K_G \le \frac{M'}{M' + 1} < 1.$$

Since the difference on the LHS must be a nonnegative integer, we have:

$$\gamma - K_G = 0,$$

which means that $B$ picks the optimal DS for any $G$, and that P = NP. Done.

Theorem  2.1(b).  Theorem  2.1(a)  holds  for  CDSP  and  SCDSP. 


Proof:  Similar  arguments  as  in  part  (a). 

This negative result leads us to examine the case when error is defined to be a
constant fraction of the optimum. Efficient algorithms with such error bounds are called
$\epsilon$-Polynomial Time Approximate Algorithms ($\epsilon$-PTAA), and are defined in the next section.


2.5. $\epsilon$-Polynomial Time Approximate Algorithms


Definition 2.9. Let $A$ be an optimization problem with positive integral cost function $c$,
and let $\mathcal{A}$ be an algorithm which, given an instance $I$ of $A$, returns a feasible solution $f_{\mathcal{A}}(I)$;
denote the optimal solution of $I$ by $\hat{f}(I)$. Then $\mathcal{A}$ is called an $\epsilon$-approximate algorithm for
$A$ for some $\epsilon > 0$ iff

$$\frac{|c(f_{\mathcal{A}}(I)) - c(\hat{f}(I))|}{c(\hat{f}(I))} \le \epsilon$$

for all instances $I$ [13].


The next result, another negative one, shows that it is extremely unlikely that an
$\epsilon$-PTAA exists for DSP, or for DDSP.

Theorem 2.2. There exists an $\epsilon$-PTAA for DSP $\iff$ there exists an $\epsilon$-PTAA for
SC $\iff$ there exists an $\epsilon$-PTAA for DDSP.


Proof: (a) PTAA DSP $\iff$ PTAA SC:

(i) (PTAA for SC $\Rightarrow$ PTAA for DSP): Consider an instance of DSP, $G(V, E)$, and suppose
there is an $\epsilon$-approximate algorithm to solve SC. Define the following instance of SC:

$$T = V, \quad S = \{N(i) \cup \{i\} : i \in V\}$$

It is clear that the SC instance has the same solution as the DSP instance. So by using the
approximate algorithm on the SC problem, we can get an $\epsilon$-approximate solution for the
DSP instance.
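The conversion in step (i) is short enough to write out. The sketch below is an illustration rather than part of the original text; it assumes the graph is a dictionary `adj` from node to neighbor set, and it indexes each subset by the node that generates it, so that a sub-collection of $S$ and a set of nodes can be compared directly.

```python
def dsp_to_set_cover(adj):
    """Ground set T = V; one subset per node i, namely N(i) united with {i}."""
    ground = set(adj)
    subsets = {i: adj[i] | {i} for i in adj}
    return ground, subsets

def is_set_cover(ground, subsets, chosen):
    """True if the subsets indexed by `chosen` together cover the ground set."""
    covered = set()
    for i in chosen:
        covered |= subsets[i]
    return covered == ground

def is_dominating(adj, D):
    D = set(D)
    return all(v in D or bool(adj[v] & D) for v in adj)

# A star: node 0 is adjacent to the leaves 1, 2 and 3.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
ground, subsets = dsp_to_set_cover(star)
```

A set of nodes dominates the graph exactly when the corresponding subsets cover $T$, so the two instances share their feasible solutions and hence their optima.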

(ii) (PTAA for DSP $\Rightarrow$ PTAA for SC): Consider an instance of SC, $(T, S)$, and suppose
there is an $\epsilon$-approximate algorithm to solve DSP. First define the two graphs $G^1$ and $G^2$ where

$G^1$ is a clique consisting of nodes $V^1 = \{c_1, \ldots, c_{|S|}\}$, and $G^2$ consists of nodes $V^2 = \{q_1, \ldots, q_{|T|}\}$
and no edges.

Next let $G(V, E)$ be such that $V = V^1 \cup V^2$, $E = E^1 \cup E^2 \cup \{(c_i, q_j) : c_i \in V^1, q_j \in
V^2, j \in S_i\}$ (see figure 2.4).

If $P$ is a solution to the SC problem, then $\{c_i : S_i \in P\}$ is a DS in $G$. Also, if
$D = C \cup Q$, $C \subseteq V^1$, $Q \subseteq V^2$, is a DS in $G$, then so is the set obtained by replacing each
$q_j \in Q$ with some $c_i$ such that $j \in S_i$. But the resulting set, which lies entirely in $V^1$,
clearly corresponds to a set cover for $(T, S)$.

We see that the instance of DSP has the same solution as that of SC, and so using
the approximate algorithm on the DSP instance, we get an $\epsilon$-approximate solution for the
SC instance. Done.

(b) PTAA DSP $\iff$ PTAA DDSP:

(i) (PTAA for DDSP $\Rightarrow$ PTAA for DSP)

Suppose there is an $\epsilon$-PTAA for DDSP. Then given an instance of DSP, $G$, do the following
to the graph:

Replace every edge $(i, j)$ with 2 directed edges $(i, j)$ and $(j, i)$. Now apply the PTAA for
DDSP to the modified graph. It is clear that the original graph $G$ and its modified
directed version share exactly the same set of dominating sets, and so the result follows.

(ii) (PTAA for SC $\Rightarrow$ PTAA for DDSP):

Suppose there is an $\epsilon$-PTAA for SC. Then given an instance of DDSP, $G(V, A)$, define:

$$T = V, \quad S = \{S_i : i \in V\} \ \text{s.t.}\ S_i = \{j : (i, j) \in A\} \cup \{i\}$$

It is clear that the SC instance has the same solution as the DDSP instance, so by using
the approximate algorithm on the SC problem, we can get an $\epsilon$-approximate solution for
the DDSP instance.

Hence,  PTAA  SC  =>  PTAA  DDSP  =>  PTAA  DSP  =>  PTAA  SC.  Done. 

Q.E.D. 

Theorem 2.2 is a rather negative result since the Set Covering Problem has been
studied extensively [10], [11], [12] and no $\epsilon$-approximate algorithm has been found. In the
next theorem, it is shown that CDSP is at least as hard as DSP to find an $\epsilon$-PTAA for, and
so we have reasons to be pessimistic for CDSP as well. Interestingly, MLSP is also at least
as hard as DSP.

Theorem 2.3. There exists an $\epsilon$-approximate algorithm for CDSP $\iff$ there exists an
$\epsilon$-approximate algorithm for MLSP $\Rightarrow$ there exists an $\epsilon$-approximate algorithm for DSP.

Proof: (a) PTAA CDSP $\iff$ PTAA MLSP
The stronger statement that the 2 problems are equivalent is proved, i.e. CDSP $\iff$
MLSP.

Let $D^*$ be a solution to CDSP for a graph $G$. It is claimed that $D^*$ is also a set of non-leaf
nodes for a maximum leaf spanning tree of $G$. Suppose not. Then there is a spanning tree
with more leaf nodes. Let the set of non-leaf nodes for this tree be $L$. Observe that $L$ must
form a dominating set in $G$ since each leaf node is connected to at least one non-leaf node.
Also, the graph induced by $L$ is connected since the spanning tree is connected. Then, $L$ is
a CDS. But $L$ has a smaller cardinality than $D^*$ by assumption, implying that $D^*$ is not a
solution of CDSP. The contradiction proves the claim.

Now suppose that $T$ is a solution to MLSP, and let $L$ be the set of non-leaf nodes
for the solution. We claim that $L$ is also a minimum cardinality CDS. From previous
arguments we know that $L$ is a CDS. Suppose it is not of minimum cardinality. Then let
$D^*$ be a CDS with smaller cardinality. We construct a spanning tree as follows: The nodes
of $D^*$ are connected by any appropriate $|D^*| - 1$ edges, which exist since $D^*$ is a CDS. Now
by definition of DS each node not in the DS can be assigned to a single adjacent member of
the set, i.e. we can make the nodes not in the DS leaf nodes of the spanning tree. Observe
that none of the nodes in $D^*$ can be leaf nodes, since the set is of minimum cardinality.
Then by assumption, the constructed spanning tree has a set of non-leaf nodes of smaller
cardinality than $|L|$. The contradiction proves the claim.

Figure  2.5  illustrates  these  arguments. 
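The spanning tree construction used in the second half of the argument can be sketched as follows. This is an illustrative Python fragment, not the author's procedure: it assumes an adjacency dictionary `adj` with sortable node labels, connects the members of the CDS by a depth-first search inside the set, and then hangs every remaining node on one adjacent member.

```python
def spanning_tree_from_cds(adj, D):
    """Return tree edges: |D| - 1 edges inside D, plus one edge per outside node."""
    D = set(D)
    root = min(D)
    tree, seen, stack = [], {root}, [root]
    while stack:                         # DFS restricted to the nodes of D
        u = stack.pop()
        for w in sorted(adj[u] & D):
            if w not in seen:
                seen.add(w)
                stack.append(w)
                tree.append((u, w))
    if seen != D:
        raise ValueError("D does not induce a connected subgraph")
    for v in sorted(adj):                # attach every other node as a leaf
        if v not in D:
            heads = adj[v] & D
            if not heads:
                raise ValueError("D is not a dominating set")
            tree.append((min(heads), v))
    return tree

def non_leaf_nodes(tree):
    """Nodes that appear in more than one tree edge."""
    degree = {}
    for u, w in tree:
        degree[u] = degree.get(u, 0) + 1
        degree[w] = degree.get(w, 0) + 1
    return {v for v, d in degree.items() if d > 1}

path5 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
tree = spanning_tree_from_cds(path5, {1, 2, 3})
```

For the 5-node path with the CDS `{1, 2, 3}`, the resulting tree has four edges and its non-leaf nodes are exactly the members of the CDS.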


(b) PTAA CDSP $\Rightarrow$ PTAA DSP.

This follows from the construction illustrated in Fig. 2.3. Given an instance of DSP,
$G$, use the construction to convert it to the equivalent instance of CDSP. Then apply the
PTAA for CDSP to the modified graph.

In view of these 2 results, we do not consider it worth pursuing $\epsilon$-approximate
algorithms. In the next section we look at algorithms for DSP which are such that the
error grows very slowly with the number of nodes in the graph. Unless there is rather large
variation in the number of radios in the PRN, such algorithms would minimize the number
of clusters quite efficiently.


2.6. Approximate Algorithms for DSP: The Greedy Heuristic

It has been strongly argued in the last few sections that heuristics for DSP with
constant or constant fractional errors do not exist. These negative results suggest that it
might be worthwhile to adopt an average-case analysis approach. However, this would not
be necessary if we could show that there are algorithms which have small fractional errors
which grow very slowly with the number of nodes in the network. Such algorithms are in
some sense "almost as good" as $\epsilon$-PTAA's. Fortunately, this is the case for a number of
heuristics, some of which are presented here.

We begin by analyzing a very simple greedy heuristic. Later on it is shown that
some other, more refined algorithms do not do any better from a worst-case standpoint.
This heuristic operates sequentially, putting into the dominating set the least numbered
node which covers the maximum number of uncovered nodes in the graph at each iteration.
Formally we have:

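Heuristic Greedy ($G(V, E)$)

Step 0: UNCOV $= V$, COV $= \emptyset$, $i = 1$, DS $= \emptyset$, $C(j) = N(j) \cup \{j\}$ $\forall j \in V$.

Step 1: $d_i = \min \arg\max_j |C(j)|$, DS $=$ DS $\cup \{d_i\}$, UNCOV $=$ UNCOV $- C(d_i)$,
COV $=$ COV $\cup C(d_i)$.

Step 2: For $j = 1$ to $|V|$,
$C(j) = \{l : l \in \text{UNCOV and } ((j, l) \in E \text{ or } l = j)\}$.

Step 3: $i \leftarrow i + 1$. If COV $\ne V$ then goto Step 1.

Step 4: Stop.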
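The steps above can be sketched in Python as follows. This is a minimal illustration rather than the thesis's own code: it assumes the graph is a dictionary `adj` from node to neighbor set with sortable node labels, recomputes the sets $C(j)$ on the fly instead of storing them, and returns the gains (the number of newly covered nodes per pick) alongside the picks.

```python
def greedy_dominating_set(adj):
    """Repeatedly pick the least numbered node covering the most uncovered nodes.

    Returns (picks, gains): the chosen nodes in order, and for each pick the
    number of previously uncovered nodes it covered.
    """
    uncovered = set(adj)
    picks, gains = [], []
    while uncovered:
        # max() keeps the first maximum, so iterating in sorted order
        # breaks ties in favor of the least numbered node.
        best = max(sorted(adj), key=lambda j: len((adj[j] | {j}) & uncovered))
        newly_covered = (adj[best] | {best}) & uncovered
        picks.append(best)
        gains.append(len(newly_covered))
        uncovered -= newly_covered
    return picks, gains

path5 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
picks, gains = greedy_dominating_set(path5)
```

On the 5-node path, the first pick is node 1 (covering 0, 1 and 2) and the second is node 3 (covering 3 and 4), so `picks` is `[1, 3]` and `gains` is `[3, 2]`.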


Let $m_i$ be the number of uncovered nodes covered by $d_i$ when it is picked by Greedy.
And let $|V| = N$, $K_G$ = cardinality of a minimum DS of $G$. Since Greedy tries to cover
as many uncovered nodes as it can at every iteration, one might wonder what percentage of
the nodes it must cover by the time it has picked the optimum number of nodes. In some
cases this is 100, but since the algorithm is not optimal, there will be instances for which
the first $K_G$ picks do not do quite as well. The next result provides a bound for this percentage:


Theorem 2.4. For all graphs $G$ we must have:

$$\frac{1}{N} \sum_{i=1}^{K_G} m_i > 1 - \frac{1}{e}.$$


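Proof: Let $P_k$ be the following optimization problem:

$$z_k = \max \sum_{i,j \in V} c_{ij} x_{ij} \qquad (2.2)$$

subject to

$$\sum_{j \in V} x_{ij} = 1, \quad \forall i \in V \qquad (2.3)$$

$$\sum_{j \in V} y_j = k \qquad (2.4)$$

$$0 \le x_{ij} \le y_j \le 1 \qquad (2.5)$$

$$x_{ij}, y_j \ \text{integer}. \qquad (2.6)$$

Define

$$c_{ij} = \begin{cases} 1, & (i, j) \in E \text{ or } i = j \\ 0, & \text{o.w.} \end{cases}$$

This is a formulation of the k-Median Problem with an assignment of edge costs such that
it can be applied to the solution of DSP. $P_k$ returns the set of $k$ nodes which covers the
largest number of nodes in the graph.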

Let $u = (u_1, \ldots, u_N)$ be multipliers for the constraints (2.3) of the problem $P_k$. Define:

$$L(x, y, u) = \sum_{i,j \in V} c_{ij} x_{ij} + \sum_{i \in V} u_i \Big(1 - \sum_{j \in V} x_{ij}\Big) = \sum_{i,j \in V} (c_{ij} - u_i) x_{ij} + \sum_{i \in V} u_i.$$

So the Lagrangian problem is

$$z_L(u) = \max_{(x, y)} L(x, y, u)$$

subject to (2.4), (2.5), (2.6).

Let $z_D = \min_u z_L(u)$ be the Lagrangian Dual. Then we have converted this problem to the
form dealt with in [8].

We can now use the following useful result from [8]:

$$\frac{z_D - s_k}{z_D - z_R} < \frac{1}{e}, \quad \forall P_k \qquad (2.8)$$

where $z_R$ is the minimum objective value of $P_k$, i.e. $z_R = 0$, and $s_k$ is the number of nodes
covered in the graph in $k$ iterations of Greedy.

Observe that $z_{K_G} = N$, that $s_k \le N$, $\forall k \le K_G$, and that $z_k = N$, $\forall k \ge K_G$. We
know that $z_D \ge z_k$. Also, the coefficient matrix of $P_k$ over constraints (2.4) and (2.5) is
easily seen to be totally unimodular. This means from Theorem 2 of [9] that $z_D$ is identical
to the optimum value of the strong LP relaxation.

From this fact, and from (2.8) with $k = K_G$ (so that $z_D = N$), we conclude that:

$$\frac{s_{K_G}}{N} = \frac{1}{N} \sum_{i=1}^{K_G} m_i > 1 - \frac{1}{e}$$

and the result follows.

The next Lemma helps to prove one of the important Theorems of this chapter.
This theorem gives us the largest possible worst-case fractional error for Greedy:

Lemma 2.2. If Greedy returns a DS $= \{d_1, \ldots, d_{d^*}\}$ then: (a) $\sum_{i=1}^{d^*} m_i = N$;
(b) $m_1 \ge m_2 \ge \ldots \ge m_{d^*} \ge 1$; (c) $\sum_{i=1}^{p} m_i + K_G m_{p+1} \ge N$, $p = 0, 1, \ldots, d^* - 1$.


Proof: (a) and (b) follow directly from the definition of Greedy. At iteration $p + 1$
there are exactly $N - \sum_{i=1}^{p} m_i$ uncovered nodes. Let this set be $S_{p+1}$. Now by definition
of Greedy it follows that $d_{p+1}$ covers the maximum number of nodes in $S_{p+1}$, of all nodes
in $G$. Now consider any optimal DS, $DS^*$. Since the nodes of $DS^*$ together cover all of
$S_{p+1}$, the average number of nodes in $S_{p+1}$ which are covered by nodes in $DS^*$ is at least

$$\frac{|S_{p+1}|}{K_G} = \frac{N - \sum_{i=1}^{p} m_i}{K_G}.$$

But this average is, by definition, at most equal to the number of nodes covered by $d_{p+1}$,
which is $m_{p+1}$. Then we have:

$$\frac{N - \sum_{i=1}^{p} m_i}{K_G} \le m_{p+1}, \quad p = 0, 1, \ldots, d^* - 1.$$

The result follows directly.
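The three properties of the lemma can be spot-checked numerically. The sketch below is an illustration under stated assumptions: `greedy_gains` is a compact re-implementation of Greedy that returns only the $m_i$, the optimum $K_G$ is found by brute force (so only tiny graphs are practical), and `random_graph` is a hypothetical helper that draws each edge independently with probability `p`.

```python
from itertools import combinations
import random

def greedy_gains(adj):
    """The sequence m_1, m_2, ... of newly covered node counts under Greedy."""
    uncovered, gains = set(adj), []
    while uncovered:
        best = max(sorted(adj), key=lambda j: len((adj[j] | {j}) & uncovered))
        newly = (adj[best] | {best}) & uncovered
        gains.append(len(newly))
        uncovered -= newly
    return gains

def min_ds_size(adj):
    """K_G by exhaustive search over node subsets of increasing size."""
    for k in range(1, len(adj) + 1):
        for D in combinations(list(adj), k):
            chosen = set(D)
            if all(v in chosen or adj[v] & chosen for v in adj):
                return k

def lemma_2_2_holds(adj):
    m, N, K = greedy_gains(adj), len(adj), min_ds_size(adj)
    part_a = sum(m) == N
    part_b = all(m[i] >= m[i + 1] for i in range(len(m) - 1)) and m[-1] >= 1
    # Zero-based m[p] is the lemma's m_{p+1}; sum(m[:p]) is m_1 + ... + m_p.
    part_c = all(sum(m[:p]) + K * m[p] >= N for p in range(len(m)))
    return part_a and part_b and part_c

def random_graph(n, p, seed):
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for u, v in combinations(range(n), 2):
        if rng.random() < p:
            adj[u].add(v)
            adj[v].add(u)
    return adj
```

Running `lemma_2_2_holds(random_graph(8, 0.3, seed))` for a range of seeds gives a quick sanity check of parts (a) to (c) on small random graphs.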

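Theorem 2.5. If Greedy returns a DS of cardinality $d^*$ for a graph $G(V, E)$ with optimum
$K_G \ge 2$, then

$$\frac{d^*}{K_G} \le 1 + \frac{1}{K_G} \cdot \frac{\log (N / K_G)}{\log \left( K_G / (K_G - 1) \right)}$$

and this bound is attained by some graph for all values of $K_G \ge 2$.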

Proof: The results of the previous lemma are used to find a bound on $d^*$. Let $T_z$
be the minimum number of nodes which could ever be covered after $z \le d^*$ iterations of
Greedy on a graph $G$. $T_z$ is obtained by solving the following LP:

$$T_z = \min \sum_{i=1}^{z} m_i$$

subject to

$$m_1 \ge m_2 \ge \ldots \ge m_z \ge 1, \qquad (2.9)$$

$$\sum_{i=1}^{p} m_i + K_G m_{p+1} \ge N, \quad p = 0, 1, \ldots, z - 1. \qquad (2.10)$$


We know that for the optimal basic feasible solution for $T_z$ exactly $z$ constraints must be
satisfied with equality. Now observe that the first constraint of the type (2.10) lowerbounds $m_1$
by $N/K_G$. In general the $i$th constraint of (2.10) lowerbounds $m_i$ in terms of $N$, $K_G$ and the
values chosen for $m_1, \ldots, m_{i-1}$. On the other hand, the constraints of type (2.9) upperbound
the $m_i$'s. Hence in the optimal basic feasible solution, in which the sum of the $m_i$'s is to
be minimized, as many constraints as possible of the type (2.10) will be satisfied with equality
without violating any of type (2.9). After some algebra we have:

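$$m_i = \max\left\{ \frac{N}{K_G} \left(1 - \frac{1}{K_G}\right)^{i-1},\ 1 \right\}, \quad i = 1, \ldots, z. \qquad (2.11)$$

For $z = d^*$, simplification yields that $m_i = 1$, $\forall i > i_1$, where

$$i_1 = 1 + \frac{\log (N / K_G)}{\log \left( K_G / (K_G - 1) \right)}. \qquad (2.12)$$

Now by summing the geometric series we get

$$\sum_{i=1}^{i_1} m_i = N - K_G + 1.$$

But we want to find $d : T_d = N$. So, we have:

$$d = i_1 + K_G - 1 = K_G + \frac{\log (N / K_G)}{\log \left( K_G / (K_G - 1) \right)}. \qquad (2.13)$$

Dividing through by $K_G$ the bound follows.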
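A small numeric experiment makes the extremal sequence concrete. The sketch below rests on assumptions that should be read with care: the source formulas are damaged at this point, so the closed form $d^*/K_G \le 1 + \log(N/K_G) / (K_G \log(K_G/(K_G - 1)))$ used in `ratio_bound` is a reconstruction from the constraints of Lemma 2.2, and `worst_case_gains` is one natural integer analogue of the LP solution (each $m_{p+1}$ is set to the smallest integer allowed by constraint (c)).

```python
import math

def worst_case_gains(N, K):
    """Smallest integer gains allowed by m_{p+1} >= (N - covered) / K and m >= 1."""
    gains, covered = [], 0
    while covered < N:
        m = max(math.ceil((N - covered) / K), 1)
        gains.append(m)
        covered += m
    return gains

def ratio_bound(N, K):
    """Reconstructed bound on d*/K for K >= 2 (an assumption, see lead-in)."""
    return 1 + math.log(N / K) / (K * math.log(K / (K - 1)))

gains = worst_case_gains(16, 2)
```

With $N = 16$ and $K_G = 2$, the sequence is `[8, 4, 2, 1, 1]`, so five picks are needed against an optimum of two. The ratio 2.5 coincides with the reconstructed bound and with the value quoted in the text for the example of Fig. 2.6.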


We still have to show that this bound is achieved for all $K_G \ge 2$. This is done
by explicitly showing the graph for $K_G = 2$, and then giving a procedure to generalize the
construction for arbitrary values. For later reference, we refer to this class of graphs as
$B_{k,n}(V, E)$ where $k = K_G$ and $|V| = N = k^{n+1}$.


Fig. 2.6: $B_{2,3}$


Greedy picks $\{j_1, \ldots, j_5\}$ in this example while the optimum is 2. Note from (2.12)
that $i_1 = 4$, and that the conditions of equation (2.11) are met with equality. From this it
follows that the bound of Theorem 2.5, which is 2.5 in this case, must be attained for the
fractional error, as in fact it is. The construction of $B_{2,n}$ is now described: Since there are
$2^{n+1}$ nodes and $K_G = 2$, let

$$N(2^{n+1} - 1) = \{1, 2, \ldots, 2^n - 1\}, \quad N(2^{n+1}) = \{2^n, \ldots, 2^{n+1} - 2\}.$$

The adjacencies of the $j_i$'s are defined so that Greedy picks them in the order of their indices,
i.e. $j_1$ is picked first, then $j_2$, etc. Thus the number of uncovered nodes covered by $j_i$ is simply
$m_i$. Pick $j_1 = 1$ and $N(1) = \{2^{n+1} - 1, 2, \ldots, 2^{n-1} - 1, 2^n, \ldots, 2^n + 2^{n-1} - 1\}$. It is clear that
$m_1 = 2^n$, and the first constraint of type (2.10) is met with equality if $j_1$ were to be picked
first by Greedy. Now observe that the least numbered uncovered element of $N(2^{n+1})$ is
$3 \cdot 2^{n-1}$. Let $j_2 = 3 \cdot 2^{n-1}$ and $N(j_2) = \{2^{n-1}, \ldots, 2^{n-1} + 2^{n-2} - 1, j_2 + 1, \ldots, j_2 + 2^{n-2} - 2, 2^{n+1}\}$.

It is easily seen that

$$|N(j_2)| = 2^{n-1} = |N(2^{n+1}) - N(j_1)| = |N(2^{n+1} - 1) - N(j_1)|.$$

Thus we see that $j_2$ is as attractive a candidate in the second iteration to Greedy as are the
nodes in the optimal set, after $j_1$ has been picked in the first iteration. It is possible to give
the exact expressions for the remaining $j_i$'s but it is hoped that the construction is clear
enough already. The following conditions must hold:

W)^NUn.)\=\:^]  |iV(i„.)|=[^],  m=l,...,d  i  =  N-l,N 

where $d$ is defined in (2.13), and $\lceil i \rceil$ is the real number $i$ rounded up. Finally, we must
generalize our construction to arbitrary values of $k$. There are $N = k^{n+1}$ nodes, and the
optimal set is $\{N - k + 1, \ldots, N\}$. Each of these nodes has a closed neighborhood of size
$k^n$, and thus the neighborhoods are disjoint. Now pick $j_1$ to be a node in $N(N - k + 1)$, and
define edges so that it covers exactly $k^{n-1}$ nodes from the neighborhoods of every node in
the optimal set. Thus $|N(j_1)| = k^n$. The reader should already begin to see the emerging
pattern since the idea is identical to that in the construction of $B_{2,n}$. Pick $j_2$ to be a
node in $N(N - k + 1) - N(j_1)$, and define edges so that it covers exactly $k^{n-2}(k - 1)$ nodes in
the neighborhoods of every node in the optimal set, such that these nodes (in $N(j_2)$) are
not in $N(j_1)$. The procedure continues analogously for the remaining $j_i$'s. The following
conditions must hold:

W0nJ?O«)l=[^l,  m  =  =  +  i . k. 

Done. 

This result is important for 2 reasons. First it extends work on an equivalent form
of Greedy which has been analyzed for the Set Covering problem by Johnson, Chvatal, and
Hochbaum [10], [11], [12]. In [11] a best bound on the error is given by $\sum_{j=1}^{\delta} 1/j$, where $\delta$ is
the maximum number of elements in any member of $S$. Since DSP is a special case of SC,
this bound holds for Greedy($G(V, E)$), but the bound in Theorem 2.5 is tighter. Note that
this does not violate Theorem 2.2 since the equivalence holds only for algorithms which
have constant fractional error.

Secondly,  the  result  tells  us  that  while  it  is  probably  not  possible  to  come  within  a 
constant  of  the  optimum  solution,  there  are  simple  algorithms  for  which  the  fractional  error 
grows very slowly with an increase in $N$ or $K_G$. This relationship is plotted in figures 2.7(a)
and  (b). 

Theorem 2.5 gives a bound on the fractional error, but does not provide insight into
the actual number of nodes picked by Greedy in terms of the size of the graph. We will now
show that a famous bound due to Vizing [15] for $K_G$ also holds for $d^*$:

Theorem 2.6.* For a graph, $G$, with $N$ nodes and $M$ edges:

$$d^* \le N + 1 - \sqrt{2M + 1}. \qquad (3.1)$$

* The proof of this theorem has been suggested by Prof. Gallager.


Proof: First convert $G$ into a directed graph by replacing every edge $(i, j)$ by 2
directed edges $(i, j)$ and $(j, i)$, and by adding self-loops $(i, i)$ at every node $i$. Interpret a
directed edge $(i, j)$ to mean that the free node $j$ would be taken if $i$ were declared cluster
head. We can interpret Greedy on such a graph as follows: Declare the node with the
greatest outdegree a cluster head; delete all edges coming into the neighborhood of that
node; and if there are any edges left, declare a new cluster head, else stop. To see that
this is identical to Greedy, interpret a directed edge $(i, j)$ to mean that if $i$ were picked to
be leader, the previously uncovered node, $j$, would be covered by $i$. Suppose Greedy picks
node $d_i$ in the $i$th iteration. Let $S(i)$ be the set of previously free nodes which were taken
by $d_i$, and let $|S(i)| = m_i$. Finally, define $E_i$ to be the number of edges left at the end of
the $i$th iteration coming from free nodes, i.e. $E_i$ does not include edges from TAKEN nodes.
Set $E_0 = 2M + N$.

Consider some $j \in S(i)$. The number of free nodes that $j$ would take is at most $m_i$ just
before the $i$th iteration, since Greedy picked $d_i$. Since all edges
come into FREE nodes, and $j$ is also FREE before the $i$th iteration, for each edge coming
into $j$ from a FREE node, there is also an edge going out of $j$ to that FREE node. Thus
there can be at most $m_i$ edges coming into $j$ from FREE nodes, and the total number of
edges running from previously FREE nodes to members of $S(i)$ is at most $m_i^2$. There
may also be edges from previously TAKEN nodes to members of $S(i)$, but we need not
consider them, since we are counting only edges from FREE nodes.

Now observe that we still have to consider the edges coming from $S(i)$ into FREE
nodes which are not in $S(i)$. These edges are not deleted by Greedy, but they are not
counted in the definition of $E_i$. It is clear that the outdegree of every node in $S(i)$ must
be $\le m_i - 1$, since the self-loops of all such nodes have been deleted in the $i$th step. Thus
the number of outgoing edges from $S(i)$ after iteration $i$ is $\le m_i(m_i - 1)$. This bound can
be tightened if we know that $d_i \in S(i)$. Then, since we know that $d_i$ has zero outdegree and
zero indegree after the $i$th iteration, we have the bound $(m_i - 1)m_{i+1}$.


By definition of $E_i$ we conclude that:

$$E_i \ge E_{i-1} - m_i^2 - m_i(m_i - 1)$$
$$E_1 \ge E_0 - m_1^2 - (m_1 - 1)m_2.$$

$E_{d^*} = 0$ by definition of $d^*$. Thus,

$$0 = E_{d^*} \ge -\Big( \sum_{i=1}^{d^*} m_i^2 + (m_1 - 1)m_2 + \sum_{i=2}^{d^*} m_i(m_i - 1) \Big) + E_0.$$

Solving for $E_0$:

$$E_0 \le \sum_{i=1}^{d^*} m_i^2 + (m_1 - 1)m_2 + \sum_{i=2}^{d^*} m_i(m_i - 1). \qquad (2.14)$$



Now recall from Lemma 2.2 that $\sum_i m_i = N$ and $m_i \ge m_{i+1} \ge 1$. $E_0$ can be upperbounded
by maximizing the RHS of (2.14) with respect to the $m_i$'s subject to the constraints just
mentioned. We claim that this maximum occurs when

$$m_1 = N - d^* + 1, \quad m_2 = m_3 = \ldots = m_{d^*} = 1.$$

This is easily seen to be true by contradiction. Suppose the maximum is achieved so that
the highest order $m_i$ which is greater than 1 is not $m_1$, say $m_j$, $j > 1$. Now reduce $m_j$ to
1 and add $m_j - 1$ to $m_1$. (We can do this because none of the constraints are violated.)
Then the difference in the RHS is:

$$\delta = m_j (2m_1 + m_2 - m_j - 1) - (m_1 + m_2)$$

$$= m_j (m_1 + m_2 + m_1 - m_j - 1) - (m_1 + m_2).$$

Now observe that $m_1 \ge m_j \ge 2$.

$$\Rightarrow \delta \ge 2(m_1 + m_2 - 1) - (m_1 + m_2)$$

$$\Rightarrow \delta \ge m_1 + m_2 - 2 > 0,$$
which contradicts the assumption.

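Substituting the maximum values in the RHS of (2.14) we have:

$$E_0 = 2M + N \le (N - d^* + 1)^2 + d^* - 1 + (N - d^*),$$

i.e. $2M + 1 \le (N - d^* + 1)^2$, and taking square roots gives the bound of the theorem.

Q.E.D.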
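The bound $d^* \le N + 1 - \sqrt{2M + 1}$ is easy to evaluate for concrete graphs. The following sketch is illustrative and assumes the same adjacency-dictionary representation as before; `greedy_size` is a compact re-implementation of Greedy that returns only $d^*$, and the three example graphs are chosen by hand.

```python
import math

def greedy_size(adj):
    """Number of nodes d* picked by Greedy (least numbered node wins ties)."""
    uncovered, count = set(adj), 0
    while uncovered:
        best = max(sorted(adj), key=lambda j: len((adj[j] | {j}) & uncovered))
        uncovered -= adj[best] | {best}
        count += 1
    return count

def size_bound(adj):
    """N + 1 - sqrt(2M + 1), with M counted from the symmetric adjacency sets."""
    N = len(adj)
    M = sum(len(neighbors) for neighbors in adj.values()) // 2
    return N + 1 - math.sqrt(2 * M + 1)

complete5 = {v: set(range(5)) - {v} for v in range(5)}
path5 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
empty4 = {v: set() for v in range(4)}
```

For the complete graph on 5 nodes Greedy picks 1 node against a bound of about 1.42; for the 5-node path it picks 2 against a bound of 3; and for the edgeless graph on 4 nodes the bound is met with equality at 4.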


The  reader  might  question  here  the  use  of  analyzing  this  centralized,  sequential 
algorithm  in  such  detail,  when  what  is  desired  is  a  distributed  procedure  with  a  high  degree 
of  parallelism.  In  response  we  note  that  a  distributed  version  of  Greedy  is  presented  in 
Chapter  3,  and  claims  on  its  ability  to  minimize  the  number  of  clusters  will  be  based  on 
the  analysis  just  completed. 

2.7.  Other  Approximate  Algorithms 

In  this  section  it  is  shown  that  other  reasonable,  but  more  complicated  heuristics 
fail  to  do  better  than  Greedy  in  the  worst  case. 

From the construction of $B_{k,n}$ one may be tempted to suggest a greedy algorithm
which runs Greedy N times, each time with a different value of $d_1$. The output DS would
be the minimum cardinality set over all repetitions of Greedy. Call this algorithm Allgreedy
and let the cardinality of its output be $D^*$. Allgreedy performs at least as well as Greedy,
and also solves $B_{k,n}$ exactly. However, consider the graph $B^m_{k,n}$, which consists of m copies
of $B_{k,n}$. (Fig. 2.8)

It is easy to see that Allgreedy will find the optimum in only one copy of $B_{k,n}$, and
will perform, in the worst case, as poorly as Greedy in the remaining m − 1 copies. For large
m Allgreedy will do negligibly better than Greedy; in fact the fractional error of Allgreedy
must approach that of Greedy from above, as m increases. Thus even for simple examples
Allgreedy is not very effective.
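
The idea of Allgreedy can be sketched in a few lines. This is a minimal illustration under stated assumptions: Greedy is taken to pick, at each step, the lowest indexed node that covers the largest number of still uncovered nodes (closed neighborhoods), and the names `greedy_from` and `allgreedy` are hypothetical.

```python
def greedy_from(adj, first):
    """Greedy dominating set with the first node d_1 forced to `first`.

    adj maps each node to the set of its neighbours (undirected graph).
    Assumption: ties are broken in favour of the lowest indexed node.
    """
    closed = {v: set(adj[v]) | {v} for v in adj}
    chosen = [first]
    covered = set(closed[first])
    while len(covered) < len(adj):
        # max() over a sorted list returns the first (lowest indexed) maximiser.
        best = max(sorted(adj), key=lambda v: len(closed[v] - covered))
        chosen.append(best)
        covered |= closed[best]
    return chosen


def allgreedy(adj):
    """Run Greedy once per possible first node and keep a smallest result."""
    return min((greedy_from(adj, v) for v in sorted(adj)), key=len)


def is_dominating(adj, nodes):
    """True if every node is in `nodes` or adjacent to a member of `nodes`."""
    covered = set()
    for v in nodes:
        covered |= set(adj[v]) | {v}
    return covered == set(adj)


# Five nodes connected in a line: 1 - 2 - 3 - 4 - 5.
path = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
result = allgreedy(path)
```

Since Allgreedy repeats Greedy N times, its running time is N times that of Greedy, which is part of why the marginal gain on graphs such as $B^m_{k,n}$ is unattractive.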

Finally, we examine an algorithm which is not greedy, i.e. nodes selected in one
step might be dropped in subsequent ones. The idea behind this algorithm is to maintain
N candidate solutions, such that the $i$th one includes node $i$. At each iteration the $i$th
candidate solution is obtained by choosing $i$, and the candidate solution from the previous
iteration which along with $i$ covers the maximum number of nodes in G. The algorithm is
presented below:

Heuristic Comm(G(V,E))

Step 0: $S_1(i) = N(i)$, $D_i = \{i\}$, $\forall i$; $k = 1$.

Step 1: $k = k + 1$; $j^* \in \arg\max_{j \in V} |N(i) \cup S_{k-1}(j)|$; $D_i = \{i\} \cup D_{j^*}$, $\forall i$.

Step 2: $S_k(i) = N(i) \cup S_{k-1}(j^*)$.

Step 3: If $\max_{i} |S_k(i)| = |V|$ STOP, and return the corresponding set $D_i$; else go to
Step 1.

Initially, the $i$th candidate solution is just $i$. In the second iteration (k=2), $j^*$
corresponds to the node which covers the largest number of uncovered nodes if the only node
picked thus far were $i$. Thus $D_i = \{i, j^*\}$ in the second iteration. $S_2(i)$ is nothing but the
set of nodes covered by $D_i$, i.e. $N(i) \cup N(j^*)$. At iteration k, the algorithm has N candidate
solutions, $D_1, \dots, D_N$, each of cardinality at most k. The $S_k$'s represent the nodes covered
by these candidate solutions. The algorithm terminates when one of the solutions covers
all the nodes in the graph. The $S_{k+1}$'s are obtained as follows:

The $i$th solution set is node $i$ combined with the solution set obtained in the $k$th iteration,
such that the combined set covers the largest number of nodes.
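
The steps of Comm can be sketched as follows. This is a minimal illustration, not the author's implementation: it assumes that N(i) denotes the closed neighborhood, that ties in the arg max are broken by lowest index, and that all candidate solutions are updated simultaneously from the previous iteration. The function name `comm` is hypothetical.

```python
def comm(adj):
    """Sketch of heuristic Comm; adj maps node -> set of neighbours."""
    nodes = sorted(adj)
    if not nodes:
        return set()
    closed = {i: set(adj[i]) | {i} for i in nodes}
    # Step 0: candidate solution D_i = {i}, covering S_1(i) = N(i).
    D = {i: {i} for i in nodes}
    S = {i: set(closed[i]) for i in nodes}
    while True:
        # Step 3: stop as soon as some candidate covers every node.
        for i in nodes:
            if len(S[i]) == len(nodes):
                return D[i]
        new_D, new_S = {}, {}
        for i in nodes:
            # Step 1: best previous candidate to combine with node i.
            j_star = max(nodes, key=lambda j: len(closed[i] | S[j]))
            new_D[i] = {i} | D[j_star]
            # Step 2: nodes covered by the new candidate.
            new_S[i] = closed[i] | S[j_star]
        D, S = new_D, new_S


# Five nodes connected in a line: 0 - 1 - 2 - 3 - 4.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
solution = comm(path)
```

Each iteration strictly increases the best coverage, because an uncovered node can always be combined with the best previous candidate, so the loop terminates after at most N iterations.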

Observe that both $B_{k,n}$ and $B^m_{k,n}$ are solved exactly by Comm, for all k, n, and m.
The reader is encouraged to verify this claim. While we cannot show worst case error, we
exhibit the following class of graphs $R_n$, for which the error is potentially arbitrarily large.

$R_n(V_n, E_n)$: Let $|V_n| = 2^n$ and label the nodes $1 \dots 2^n$. Let i base 2 be the n digit binary
representation of $i < 2^n$. Now define the adjacencies as follows:

N(1) = { i: i base 2 has a 1 in the first position }

N(2) = { i: i base 2 has a 1 in the second position }
etc. until

N(n−2) = { i: i base 2 has a 1 in the (n−2)th position }

N(n−1) = { i: i base 2 has 01 in the last 2 positions }

N(n) = { i: i base 2 has 10 in the last 2 positions }

N(n+1) = { i: i base 2 has 00 in the last 2 positions }

N(n+2) = { i: i base 2 has 11 in the last 2 positions }

Observe that the optimal DS = {n − 1, n, n + 1, n + 2}. However, Comm picks the set
{1, 2, ..., n + 2}. To see why this happens, first observe that we need to focus attention
only on the nodes 1, ..., n + 2 since these nodes have much larger neighborhoods than any of
the other nodes for moderate values of n. In particular, nodes 1, ..., n − 2 have degree $2^{n-1}$
and nodes n − 1, ..., n + 2 have degree $2^{n-2}$. It is convenient to call A = {1, 2, ..., n − 2}
and A' = {n − 1, n, n + 1, n + 2}. Now observe that:


$$|N(K) \cup N(i)| = |N(K)| + 2^{n-2} - 0.25|N(K)| = 0.75|N(K)| + 2^{n-2}, \quad K \subset A \cup A', \ i \in A' - K \qquad (2.15)$$

$$|N(K) \cup N(i)| = |N(K)| + 2^{n-1} - 0.50|N(K)| = 0.50|N(K)| + 2^{n-1}, \quad K \subset A \cup A', \ i \in A - K \qquad (2.16)$$

We see that

$$0.75|N(K)| + 2^{n-2} < 0.50|N(K)| + 2^{n-1} \ \Rightarrow \ |N(K)| < 2^n. \qquad (2.17)$$
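
The counting identities above can be checked numerically on the neighborhood sets alone. The sketch below is an illustration under stated assumptions: it works with the family of sets N(1), ..., N(n+2) over all n digit binary strings (rather than building the full graph), it takes K to be a subset of A, and the names `rn_sets` and `cover` are hypothetical.

```python
from itertools import product


def rn_sets(n):
    """Neighbourhood sets N(1..n+2) of R_n over all n-digit binary strings."""
    universe = set(product((0, 1), repeat=n))
    sets = {}
    # Nodes 1 .. n-2: a 1 in the p-th position.
    for p in range(1, n - 1):
        sets[p] = {s for s in universe if s[p - 1] == 1}
    # Nodes n-1 .. n+2: fixed pattern in the last two positions.
    tails = [(0, 1), (1, 0), (0, 0), (1, 1)]
    for node, tail in zip(range(n - 1, n + 3), tails):
        sets[node] = {s for s in universe if s[-2:] == tail}
    return universe, sets


def cover(sets, K):
    """Union of the neighbourhood sets of the nodes in K."""
    return set().union(*(sets[k] for k in K))


n = 5
universe, sets = rn_sets(n)
NK = cover(sets, [1, 2])              # K = {1, 2}, a subset of A
with_a_prime = len(NK | sets[n - 1])  # add a node of A'
with_a = len(NK | sets[3])            # add another node of A
```

For n = 5 and K = {1, 2} the two unions have sizes $0.75|N(K)| + 2^{n-2}$ and $0.50|N(K)| + 2^{n-1}$ respectively, and the four sets of A' together cover every string.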

From (2.17) it follows that for the first n − 2 iterations all of the candidate solutions
have either no elements which belong to A', or at most one element. In iteration n − 1,
$D_i = A \cup \{j\}$, $j \in A'$, $\forall i$. The remaining three iterations proceed by picking 3 nodes in
A' so that by the end of the (n + 2)th iteration, $D_i = A \cup A'$ $\forall i$. Comm is off by exactly
n − 2, implying that the error can become arbitrarily large, and that it varies roughly as ln
N. Thus Comm is not a favorable alternative to Greedy.

Observe that the fractional errors of Allgreedy and Comm are the same for the class
of graphs $R_n$.


2.8.  Summary 

In  this  chapter  the  complexities  of  minimizing  the  number  of  clusters  and  gateway 
nodes  in  a  PRN  were  examined  from  a  centralized  standpoint.  It  is  extremely  unlikely  that 
an algorithm exists which could come to within even a constant factor of the optimal solution and still
run  in  polynomial  time.  However,  there  are  algorithms  for  DSP,  with  errors  that  grow 
extremely  slowly  with  the  number  of  nodes.  We  have  given  best  possible  bounds  for  the 
performance  of  some  such  heuristics,  and  doubt  seriously  the  existence  of  a  heuristic  which 
performs  significantly  better  in  the  worst  case. 

No  results  have  been  obtained  for  heuristics  for  CDSP,  and  SCDSP.  However, 
minimizing  the  number  of  clusters  certainly  has  a  diminishing  effect  on  the  number  of 
nodes in the backbone network. This cuts down, to some extent, the amount of communication
necessary to deliver inter-cluster messages. In the next chapter, emphasis will be on
minimizing  the  communication  complexity  and  the  number  of  gateway  nodes,  within  the 
framework  of  distributed  algorithms,  based  on  heuristics  whose  computational  complexity 
graph G(V,E), where


E = {(i,j) : (i,j), (j,i) ∈ …}


Let  there  be  M  undirected  edges  in  G.  It  is  assumed  that  G  is  connected. 

(b)  Nodes  must  send  messages  on  all  the  directed  (and  therefore  undirected)  edges 
emanating  from  them.  This  is  to  capture  the  broadcast  nature  of  the  network. 

(c) Errors in transmission are not allowed, i.e. there are underlying acknowledgement
schemes. Other lower level protocols such as error detection and message formatting
are also assumed.

(d)  Messages  arriving  at  a  node  are  queued,  and  then  processed  on  a  first  come  first 
serve  basis. 

(e)  The  delay  across  any  edge  is  finite  but  variable.  Messages  are  assumed  to  be  received 
along  a  direction  of  an  edge  in  the  order  that  they  were  transmitted. 

(f)  Interference  is  ignored  and  is  to  be  dealt  with  later. 

This  model  is  a  variation  of  the  one  presented  in  [14]  which  deals  strictly  with  point 
to  point  networks.  Also,  observe  that  (a)  is  a  “snapshot”  representation  of  the  mobile 
network. 

3.2. Performance Measures for Distributed Algorithms

Two  measures  of  performance  have  traditionally  been  considered  for  distributed 
algorithms,  both  predicated  on  the  notion  of  an  elemental  message.  An  elemental  message 
is  an  integer  representing  a  node  identity,  the  cardinality  of  a  set,  or  some  other  such 
quantity.  The  first  measure  of  performance  is  the  Communication  Complexity,  C,  and  is  an 
asymptotic  bound  on  the  number  of  messages  sent  in  the  worst  case,  as  a  function  of  the 


problem  size.  In  a  network  the  problem  size  is  usually  the  number  of  nodes,  or  the  number 
of  nodes  plus  the  number  of  edges  M. 

The Time Complexity, T, is a measure of the total amount of time it takes to run
the  algorithm  in  the  worst  case.  Most  of  this  time  is  taken  by  communication  among 
nodes,  since  typically,  many  computation  steps  can  be  carried  out  in  the  time  that  it  takes 
a  message  to  be  transmitted  and  received.  So  unless  the  algorithm  is  highly  inefficient 
computationally,  it  is  reasonable  to  assume  that  computation  time  is  negligible.  Observe, 
that  if  the  algorithm  takes  exponential  time,  the  assumption  of  zero  computation  time  falls 
through.  Hence,  Time  complexity  is  measured  as  the  asymptotic  bound  on  the  number  of 
units of time required, if the communication of an elemental message across an edge takes
one  unit  of  time. 
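
To make the two measures concrete, the sketch below counts both quantities for a simple flooding procedure under the unit delay convention just described. Flooding is used here only as a familiar example; it is not one of the algorithms analyzed in this report, and the function name `flood` is hypothetical. Each node, on first hearing the message, sends it once on every edge emanating from it.

```python
def flood(adj, source):
    """Simulate flooding with unit edge delay.

    Returns (messages, time_units): the number of elemental messages sent
    over edges and the number of time units until the last one is received.
    adj maps node -> set of neighbours (undirected, connected graph).
    """
    informed = {source}
    frontier = [source]
    messages = 0
    time_units = 0
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in adj[u]:
                messages += 1          # one message per directed edge use
                if v not in informed:
                    informed.add(v)
                    next_frontier.append(v)
        time_units += 1                # all messages of this wave arrive together
        frontier = next_frontier
    return messages, time_units


# Four nodes connected in a line: 0 - 1 - 2 - 3 (M = 3 edges).
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
messages, time_units = flood(path, 0)
```

On a connected graph every node sends once on each of its edges, so the message count is 2M regardless of the source, while the time depends on how far the farthest node is from the source.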

It is important to realize that these measures do have their limitations. Firstly, they
are asymptotic, and do not account for possibly large multiplicative and additive constant
factors. Since most mobile PRN's are not very large, asymptotic results can at best be
viewed as crude measures to compare different algorithms for the same problem. For this
reason, the actual number of elemental message units will be presented with any results for
C or T. Secondly, the results are worst-case, and therefore not necessarily indicative of
typical performance. Lastly, for most packet networks there is a large overhead in sending
many short messages, as against sending a few longer ones. This is primarily due to the
fact that much header information goes into every packet, and our definition of elemental
message does not include this. To partially counter this effect, we will also give the total
number of broadcasts required, with any results for C.

Since there are two performance criteria for distributed algorithms, something must
be said about the tradeoffs between them. In commercial wide area point-to-point networks, C
has an impact on the availability of the network for other functions, whereas T has an impact
on the grade of service offered to the users. In the tactical environment, these impacts can be
viewed as the level of security (i.e. likelihood of enemy interception) versus the timeliness of
decision making. In most networks, the amount of communication required is more sensitive
to fluctuations in dollar terms than is the time taken to run the algorithm, and so C is
generally considered to be more crucial. Also, C always upperbounds T. However, in some
PRN's the value of timely service may be much more than that of less communication, and
so there is no application independent manner by which the tradeoffs of these measures can
be examined.

3.3.  The  Distributed  Greedy  (DISTG)  Algorithm 

The Distributed Greedy Algorithm finds a Dominating Set for the underlying graph
G, without taking into account the effects of interference. Recall that G consists of undirected
edges which represent the bidirectional connectivities between node-pairs. The reason
for not including the other edges is that we would like eventually to extend this algorithm
to a truly broadcast algorithm called the Greedy Linked-Cluster Algorithm (GLCA), in
which it is assumed that no acknowledgement schemes are present as lower level protocols.
If the connection is not bidirectional, then messages cannot be acknowledged in a direct
manner, and the problem becomes extremely complicated. DISTG contains virtually all of
the attributes which will enable us to claim that GLCA, the extended algorithm, minimizes
clusters as well as Greedy does. Also, the correctness of GLCA follows straightforwardly
from that of DISTG.

DISTG is entirely event-driven in its communication: there are no time outs. This
is desirable since it eliminates the need for accurate timers, but more importantly because
it limits the amount of communication. The algorithm is conducive to a high degree of
parallelism: different parts of the network may be in entirely different stages, and the
nodes terminate based on events which have occurred, not on some period of time. This
clearly plays a role in keeping T down as well. A convenient assumption is that every node
knows the (bidirectional) connectivities of itself, and those of its neighbors, at the start of
the algorithm. This can be achieved easily by a 2 stage flooding scheme [5].

The idea of the algorithm is the following: Suppose a node, i, covers the largest
number of uncovered nodes in its second neighborhood. This means that if at that point in
time, any of i's neighbors, say j, is to be covered, then i would be that node in the graph
which would cover the largest number of nodes and which also covers j. So it is reasonable
to assert that a node with only very local knowledge of the network topology would like to
elect as cluster leader its neighbor which covers the largest number of uncovered nodes. If
all nodes can be dominated by their most “highly connected” neighbors, a reasonably small
number of clusters can be expected. DISTG fulfils this goal, as we will see.

Data Structures at node i:

Connectivity: a list of edges which describe N(N(i)).

Status: a variable which takes values from {FREE, HEAD, TAKEN}, depending on whether
i is uncovered, a leader, or covered but not a leader.

Nstatus: a list indexed such that Nstatus(j) is the status of j ∈ N(i).

Uplist: a list of tuples such that (a, b) ∈ Uplist iff necessary changes in information on node
b which are caused by the declaration of a as a leader, have been made at node i.

UpdateStatus(a): a boolean variable which is either WAIT or NOWAIT. It is WAIT if there
is at least one node in N(i) from which new information is likely to be received because of
the declaration of node a as a cluster head.

k*_i: the lowest indexed node of N(i) which has a maximum cardinality set of FREE neighbors.
This is clearly the most highly connected neighbor of the node as discussed above.

Kstatus: a list of the k* values of all the node's FREE neighbors.

Flag: a variable which is set either UP or DOWN. It is UP iff UpdateStatus(a)=WAIT
for some node a, or if no information has been received from the start of the algorithm on
k* for some neighbor of i. One can view Flag as a necessary check on extreme degrees of
parallelism among neighboring nodes, which could result in outdated events being acted
upon. When Flag is set to UP, one must interpret this as a step in the algorithm when the
node wants to update its information on the k* values of its neighbors.

3Sent: a list of boolean variables, each set to TRUE or FALSE. 3Sent(l)=TRUE means that
i is 3 hops away from a cluster head, l, and that it has broadcast the necessary information
pertaining to this fact.

Leader: the identity of the cluster leader of i.

The algorithm is now presented. We focus on how it works at a node, i. The
initialization is as follows:

Flag=UP since the node has no information on the k* values of its neighbors.
3Sent(l)=FALSE ∀l;
Leader=i;
UpdateStatus(a)=NOWAIT for all a ∈ N(N(i));
Nstatus(j)=FREE ∀j ∈ N(i);
Uplist, Kstatus=∅;
Send(k*_i);
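
The per-node state and its initialization can be sketched as follows. This is an illustrative data layout, not the author's implementation: the class name `NodeState`, the field names, and the helper `best_neighbour` are hypothetical, neighborhoods are taken to be closed when counting FREE nodes, and a missing entry in `update_status` or `three_sent` is treated as NOWAIT or FALSE respectively.

```python
from dataclasses import dataclass, field

FREE, HEAD, TAKEN = "FREE", "HEAD", "TAKEN"


@dataclass
class NodeState:
    """State kept at node i, initialised as in the text."""
    node: int
    neighbours: set                                   # open neighbourhood of i
    flag_up: bool = True                              # Flag=UP at the start
    leader: int = None                                # set to i below
    nstatus: dict = field(default_factory=dict)       # j -> FREE/HEAD/TAKEN
    kstatus: dict = field(default_factory=dict)       # FREE neighbour j -> k*_j
    uplist: set = field(default_factory=set)          # tuples (a, b)
    update_status: dict = field(default_factory=dict) # a -> "WAIT"; absent = NOWAIT
    three_sent: dict = field(default_factory=dict)    # l -> True; absent = False

    def __post_init__(self):
        self.leader = self.node
        for j in self.neighbours | {self.node}:
            self.nstatus[j] = FREE


def best_neighbour(adj, status, i):
    """k*_i: lowest indexed node in the closed neighbourhood of i that has
    the largest number of FREE nodes in its own closed neighbourhood."""
    def free_count(c):
        return sum(1 for v in adj[c] | {c} if status[v] == FREE)
    # max() over a sorted list returns the lowest indexed maximiser.
    return max(sorted(adj[i] | {i}), key=free_count)


# Five nodes connected in a line, all FREE at the start.
line = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
status = {v: FREE for v in line}
k_star = {v: best_neighbour(line, status, v) for v in line}
state3 = NodeState(node=3, neighbours=line[3])
```

On the five node line, node 2 is the initial choice of itself and of both its neighbors, which is the situation in which a node may declare itself a cluster head.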

As the values of k* are received from neighbors, Kstatus is updated. Once values of
k* have been received from all neighbors, the procedure TryHead listed below is executed.
At this point i checks to see if it is the choice of all of its neighbors. If so, it declares itself a
cluster leader, changes its status to HEAD, and terminates the algorithm after broadcasting
the change of status to its neighbors. It also broadcasts N(i); we will see why this is done
later. TryHead is executed, as shown later, in response to various messages.

Event TryHead:
begin
    If (Flag=DOWN) AND (Kstatus(j) = i ∀j ∈ Kstatus) then (* is i ripe? *)
    begin
        Leader=i;
        Nstatus(i) := HEAD;
        Send(i is a head, N(i)); [type0 message]
        TERMINATE;
    end
end


Let us now consider the response of i to the message that one of its neighbors, j, is
a head. First it records the information that j is a head in Nstatus. A_j (see code) is the set
of nodes in N(i) which are affected by j's declaration. These sets enable Nstatus, Kstatus,
and Uplist to be modified to record all changes in N(i) due to j's declaration. After this,
i must pass on the new state to its neighbors if in fact there are neighbors which do not
know that j is a cluster head, and which have not terminated. This is necessary because:

(a) If i has changed its status, its neighbors must know that if they were to declare
themselves cluster heads, they would cover at least one less node. Also, they must
now disregard the value of k*_i.

(b) Suppose k*_p = i for some neighbor p. It is clear that this value can change, once
the number of FREE nodes covered by i does. Assume that the value changes to
q, where q is a node which would declare itself to be cluster leader, if only its free
neighbor, p, would change its value of k* to q. Then we see that in order for q to be
able to change its status, i must communicate the new status of the node i, along
with the information for p to calculate the size of i's new FREE neighborhood.

Hence, i broadcasts the fact that it is TAKEN by j, and N(j). Note that if there
are no FREE nodes in N(i), i will broadcast the information and then terminate. This
situation is illustrated by figure 3.1, and the code presented below:

Fig. 3.1. After 3 declares itself a HEAD, nodes 7 and 9 will prefer 8 instead of 10.

Event DoTaken: executed when Rec(Type0 from j):
begin
    If (j,i) ∉ Uplist then
    begin
        If Leader=i then Leader=j; (* assign i to leader j *)

        (* Find i's affected neighbors *)
        A_j = N(i) ∩ {k : k ∈ N(j) and Nstatus(k)≠HEAD};
        Nstatus(j) = HEAD;
        Delete Kstatus(j); (* j is not FREE anymore *)
        For all p ∈ N(j) s.t. Nstatus(p)≠HEAD do
            Nstatus(p)=TAKEN;

        (* Modify the datastructures *)
        For all k ∈ A_j do
        begin
            Delete Kstatus(k);
            Uplist=Uplist ∪ {(j,k)}; (* Since i ∈ A_j, (j,i) ∈ Uplist *)
        end;

        (* Any nodes in N(i) which do not know that j is a HEAD are told so. *)
        If ∃k ∈ N(i) − {j} such that (j,k) ∉ Uplist then
        begin
            UpdateStatus(j)=WAIT;
            Send(i taken by j, N(j)); [type1 message]
        end
        else
        begin
            UpdateStatus(j)=NOWAIT;
            If Kstatus(m) = i for some m ∈ Kstatus then
                TryHead; (* Since Flag may be down *)
        end;

        If {k : k ∈ N(i), Nstatus(k)=FREE} = ∅ then TERMINATE;
    end
end


Observe that if i broadcasts, it also sets UpdateStatus(j)=WAIT. The reasons for
doing this are best illustrated by the simple example of 5 nodes connected in a line. (Fig.
3.2.) Another thing to note is that i does not send a value of k* since it is now TAKEN.

It can easily be seen that the only node that can declare itself to be a HEAD, after
Flag=DOWN for the first time, is 2. This node sends out the information that it is a leader
to nodes 1 and 3. Node 3 receives this, and in executing Event DoTaken, sends out its
broadcast. Now suppose UpdateStatus(j) is not set to WAIT. Then it is possible for Flag
to be DOWN. Observe that the most current value of k*_4 that 3 possesses is 3. Its only other
neighbor is not free, and so event TryHead has occurred. Hence 3 becomes a cluster head.
Similarly, 4 and 5 will also become cluster leaders. By ensuring that Flag=UP, we force 3
to wait until node 4 has had a chance to reevaluate its value of k*, which is clearly 4 itself.

We now move to the case when i receives a message sent by a node j, whose neighbor
l has just declared itself a cluster leader. If i is just one hop away from l then it will execute
DoTaken subsequently, if it has not done so already. Otherwise i must be 2 hops away from
l. C_l is the set of nodes in N(i) which are affected by l's declaration. When i receives
a type1 message pertaining to a leader l for the first time, it has enough information to
modify all its variables correctly: all of i's non-head neighbors adjacent to l are in the set C_l,
which is determined as shown in the code. Hence (l,i) is added to Uplist, and all subsequent
type1 messages pertaining to l are ignored.

Again, i sets UpdateStatus=WAIT, but only if there is a chance of it declaring itself
a cluster leader incorrectly. Fig. 3.3 shows this condition, and also illustrates some of the
points made earlier:

Event TwoAway: executed when Rec(Type1 message from j; l is leader)
begin
    (* Continue only if i is 2 hops away, and has not heard about l's declaration *)
    If (l,i) ∉ Uplist and l ∉ N(i) then
    begin
        Uplist=Uplist ∪ {(l,j)};
        C_l = {N(i) ∩ N(l) ∩ {k : Nstatus(k)≠HEAD}};
        Nstatus(l)=HEAD;

        (* Modify datastructures *)
        For all k ∈ C_l do
        begin
            Delete Kstatus(k);
            Uplist=Uplist ∪ {(l,k)};
        end;

        For all p ∈ N(l) ∩ N(N(i)) s.t. Nstatus(p)≠HEAD do
            Nstatus(p)=TAKEN;

        If Nstatus(i)=FREE then Recalculate k*_i;
        Send(l, N(l), and [k*_i if Nstatus(i)=FREE]); [type2 message]
        Uplist=Uplist ∪ {(l,i)};

        If ∃p : (l,p) ∉ Uplist, Kstatus(p)=i then
            UpdateStatus(l)=WAIT
        else
        begin
            UpdateStatus(l)=NOWAIT;
            TryHead;
        end;

        If {k : k ∈ N(i), Nstatus(k)=FREE} = ∅ then TERMINATE;
    end
end

Fig. 3.3.

Node 1 has declared itself a leader, and node 5 has received a message from, say,
node 3, informing it of this fact. Now 5 has only 2 FREE neighbors left (6 and 7), both of
which had initially chosen 5 to be their most highly connected neighbor. If UpdateStatus(l)
were not set to WAIT, then it is possible that Flag=DOWN, and 5 would declare itself to
be a leader, which it clearly should not be able to do. When Flag=UP, 5 must wait to be
informed that 7 is now the most attractive candidate.

However, if i is none of its FREE neighbors' most highly connected neighbor, it is not
necessary to change UpdateStatus(l) to WAIT. If i is FREE, the new value of k*_i must be
broadcast. This is calculated easily from Connectivity and Nstatus.

We now consider i's response to a type2 message. There are 3 possibilities: i is
either 1, 2, or 3 hops away. In the first and second cases, no communication is necessary,
but i must update its variables. This is done as shown in the code. It is possible that
UpdateStatus=NOWAIT, and i may execute TryHead subsequently.

If i is three hops away from l it must send an updated value of k*_i. Again, i has
all the information necessary to do this the very first time it receives a type2 message
pertaining to l: E_l is the set of neighbors of i which are two hops away from l, and not
HEADs. At this point i can find its most highly connected neighbor from Connectivity and
Nstatus. Hence, k*_i can be broadcast without waiting to hear from the other nodes, if in
fact Nstatus(i)=FREE. To prevent i from rebroadcasting another type3 message pertaining
to l, the boolean 3Sent is set to TRUE.

Observe that while i cannot calculate the new values of k* of all members of E_l, it
need not wait to hear from all of them before executing TryHead. This is because all of i's
neighbors that preferred it before l's declaration will continue to do so after the declaration.
(See Fig. 3.4 for an illustration of this point.)

Node 1 declares itself a HEAD. Suppose node 6 receives a type2 message from node 3, but
has not heard from node 5. In 3.4(a), Kstatus(5) ≠ 6 at node 6, and so 6 will not declare
itself head before hearing from node 5. In 3.4(b), Kstatus(6)=5 before 6 has heard from 5,
and this value will not change even after it has.

Event OneTwoThreeAway: executed when Rec(Type2 from j; l is leader)
begin
    If (l ∈ N(i)) OR (l ∈ N(N(i))) then (* i.e. if i is 1 or 2 hops away from l *)
    begin
        If Nstatus(j)=FREE then
            Replace the value of k*_j in Kstatus;
        Uplist=Uplist ∪ {(l,j)};
        If ∄k ∈ N(i) − {l} s.t. (l,k) ∉ Uplist then
        begin
            UpdateStatus(l)=NOWAIT;
            TryHead;
        end
    end
    else begin (* i is 3 hops away *)
        E_l = {k : k ∈ N(i), Nstatus(k)≠HEAD s.t. ∃p ∈ {N(k) ∩ N(l)}
               AND Nstatus(p)≠HEAD};
        If Nstatus(i)=FREE then Recalculate k*_i;

        For all p ∈ N(l) ∩ N(N(i)) s.t. Nstatus(p)≠HEAD do
            Nstatus(p)=TAKEN;

        If not 3Sent(l) then (* Only one broadcast per head *)
        begin
            Send(l, and [k*_i if Nstatus(i)=FREE]); [type3 message]
            3Sent(l)=TRUE
        end;

        TryHead;
    end
end.


The only remaining case is the action taken on receiving a type3 message. When
this happens i checks if it is 2 hops away from l. If it is not, then it must be 3 or 4 hops
away and merely updates its Kstatus. If it is 2 hops away, it replaces the new value of k*,
and modifies Uplist appropriately. It then checks to see if the conditions have been met to
close UpdateStatus(l).

Event TwoFourReceive: executed when Rec(Type3 from j; l is leader);
begin
    If l ∈ N(N(i)) then
    begin
        Uplist=Uplist ∪ {(l,j)};
        Replace k*_j in Kstatus;
        If ∄k ∈ N(i) such that (l,k) ∉ Uplist then
        begin
            UpdateStatus(l)=NOWAIT;
            TryHead;
        end
        else
            UpdateStatus(l)=WAIT;
    end
    else
    begin
        Replace k*_j in Kstatus; TryHead;
    end
end.


3.4.  Correctness  of  DISTG 

The  approach  taken  is  to  show  that  DISTG,  and  a  centralized  algorithm  Multi- 
greedy  produce  exactly  the  same  dominating  sets  for  all  graphs.  Multigreedy  is  sequential, 
but  may  pick  more  than  one  node  for  the  dominating  set  in  a  single  iteration. 

Multigreedy converts the undirected graph G to a directed one. This is done by replacing
every edge in G by 2 directed edges going in opposite directions. In addition a loop is added
at each node. Interpret a directed edge, (i,j), to mean that if i were to be chosen at that
stage of the algorithm, then j would be covered. Initially, all nodes are uncovered so if a
node is chosen then its entire closed neighborhood is covered. This explains step 0 (see
code). Following is some notation that helps explain operations on the directed graph:
$N^+(i)$ = the outneighborhood of node i.

A Ripe node is defined as follows: i is Ripe iff

$$\left[\, |N^+(i)| > |N^+(j)| \ \text{OR} \ \big( |N^+(i)| = |N^+(j)| \ \text{AND} \ i < j \big) \,\right] \quad \forall j \in N(N^+(i)).$$

Some explanation is in order. i is Ripe if and only if it is the most “highly connected” node
of all its FREE neighbors. The free neighbors of i are $N^+(i)$. None of the neighbors of
these free neighbors of i should have a greater outdegree than i. This gives the expression.


Algorithm Multigreedy: Input: An undirected graph, G(V,E).

(0) Replace every edge (i,j) in E by 2 directed edges (i,j) and (j,i). For every node i,
add (i,i). Set D = ∅.

(1) K = {i : i is Ripe}.

(2) D = D ∪ K.

(3) Delete all edges coming into N(K).

(4) If any edges remain, go to step 1; else STOP.

Output: a dominating set, D.

Once a node i is chosen, by our interpretation of directed edge, all edges incoming
to the nodes in N(i) must be deleted. This is done in step 3. If there are any edges left
at the end of an iteration, there are still some uncovered nodes remaining, and so step 1 is
repeated.
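
A centralized sketch of Multigreedy is given below for illustration. It rests on assumptions that should be read carefully: the rivals of a node i are taken to be the neighbors (in G) of the nodes still covered by i, following the prose explanation of Ripe; a node with no remaining outgoing edges is never Ripe; and only the outgoing edge sets are stored, since deleting the edges coming into a covered node is the same as removing that node from every out-neighborhood. The function name `multigreedy` is hypothetical.

```python
def multigreedy(adj):
    """Sketch of Multigreedy; adj maps node -> set of neighbours in G."""
    nodes = sorted(adj)
    closed = {i: set(adj[i]) | {i} for i in nodes}
    # Step 0: out-neighbourhood of i = closed neighbourhood (edges plus self loop).
    out = {i: set(closed[i]) for i in nodes}
    D = set()
    while any(out[i] for i in nodes):          # Step 4: edges remain
        # Step 1: find all Ripe nodes.
        ripe = set()
        for i in nodes:
            if not out[i]:
                continue                       # assumption: outdegree 0 is never Ripe
            rivals = set().union(*(closed[f] for f in out[i])) - {i}
            if all(len(out[i]) > len(out[j]) or
                   (len(out[i]) == len(out[j]) and i < j) for j in rivals):
                ripe.add(i)
        D |= ripe                              # Step 2
        # Step 3: delete all edges coming into N(K).
        covered = set().union(*(closed[k] for k in ripe))
        for i in nodes:
            out[i] -= covered
    return D


# Five nodes connected in a line: 1 - 2 - 3 - 4 - 5.
line = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
heads = multigreedy(line)
```

On the five node line the sketch selects node 2 in the first iteration and node 4 in the second. The lowest indexed node of maximum outdegree is always Ripe, so every iteration removes at least one edge and the loop terminates.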

Observe that the k* values in DISTG help determine which nodes are Ripe in the
network at a particular stage of the algorithm. The deletion of an edge in Multigreedy is much
like the book-keeping done for Nstatus in DISTG. In the following argument we will verify
that only Ripe nodes are declared cluster leaders by Procedure TryHead in DISTG.

After the initial values of k* have been received at the least numbered node with
the highest outdegree, and its Flag=DOWN for the first time, the node will declare itself
to be a cluster head, broadcast, and terminate. This ensures that the algorithm will in fact
begin. Note that this node is Ripe in the first iteration of Multigreedy.


The  next  Lemma  will  be  useful  in  the  proof. 

Lemma  3.1.  If  2  nodes  declare  themselves  to  be  cluster  leaders  at  the  same  time,  they 
must  have  no  FREE  neighbors  in  common. 

Proof: By contradiction. Let nodes p and q declare themselves to be leaders
simultaneously, and let their neighborhoods have some FREE node, r, in common. Since
both nodes consider themselves to be Ripe, r must have considered each one of them its
most highly connected neighbor at different times. Suppose, without loss of generality, that
q was an earlier value of $k^*_r$ than was p. When $k^*_r$ changed to p, it must have done so
because of a decrease in $|N^+(q)|$. It is now argued that r could not have learned of this
decrease from q itself.

Suppose q knew that its FREE neighborhood had decreased before its declaration. When
q broadcast this type1 or type2 message, it had to have set UpdateStatus(i)=WAIT for
some i, implying that Flag must have been UP. Thus it must have received the modified
value of k* before setting UpdateStatus(i)=NOWAIT, i.e. making it possible for Flag to
be DOWN, and so it could not consider itself to be Ripe. This contradiction shows that q
could not have known of the decrease in its FREE neighborhood before declaring itself a
cluster head, implying that r could not have heard of the decrease from q.

So suppose r heard from some $u \ne q$, and let t be the node whose declaration
resulted in the decrease in $|N^+(q)|$. It is clear that q and t must have at least one node,
g, in the intersection of their closed neighborhoods which was FREE before t's (and q's)
declaration. Suppose g = q. Then $k^*_q = t$, and q could not have declared itself a HEAD
without hearing about t's declaration. If g = t then Nstatus(t)=q at q, implying that t
could not have declared itself HEAD. So g is some node other than q and t. Now since t's
declaration precedes q's, it follows that when g was FREE, $k^*_g = t$. Then Nstatus(g)=t at q
when it declares itself HEAD because q does not know of t's declaration. This contradiction
shows that g does not exist, implying that the FREE neighborhood of q does not change
by t's declaration $\Rightarrow$ $k^*_r$ does not change $\Rightarrow$ p could not have declared itself a cluster head
until it learned that q had.

Q.E.D. 

Now suppose that a node i declares itself a cluster leader at some time, T, of the
algorithm. There are 2 possible cases:

(1) Node i was a FREE node just before it became a HEAD.

(2) Node i was a TAKEN node just before it became a HEAD.

Case 1: In general some of i's neighbors are FREE, and others already TAKEN, when i becomes
a leader. Note that none of the neighbors can be leaders, since i is FREE. Now,
suppose that N(i) contains no TAKEN nodes. Then the value of $|N^+(i)|$ has not
changed since the beginning of the algorithm. Since i is going to declare itself a
HEAD, all of its neighbors must have regarded it as their most highly connected
neighbor at some earlier time in the algorithm. Now observe that the values $|N^+(j)|$
never increase for any node j, implying that, since none of the nodes in N(i) is TAKEN, i
still has to be the most highly connected node of all its neighbors.

But suppose that some of i's neighbors are TAKEN nodes. Let j be the neighbor to
be most recently taken before time T. Note that this time must be strictly before
T, because no 2 nodes can simultaneously become cluster leaders if they have a
FREE node in common. The FREE node in this case is i. Now suppose that some
FREE neighbor of i, say p, does not have $k^*_p = i$ at time T. Obviously, i does not
know this since it is about to declare itself a HEAD. This means that i was p's
most highly connected neighbor at some earlier time. This situation could only
have changed by a subsequent decrease in the value of $|N^+(i)|$. The last time this
happened was when j was taken. At this time, i must have sent a type2 message
to node p, and set its Flag to UP. At time T, i's Flag is DOWN, which means that
node p considered i to be its most highly connected neighbor even after $|N^+(i)|$
had reduced to its value at time T. This contradicts our assumption, and DISTG
always picks a Ripe node in this case.


Case 2: Since i is TAKEN at time T, at least one of its neighbors has already declared itself a
cluster head. Let j be the neighbor to become a head most recently before T. Before
i received the type0 message from j it could not have considered itself Ripe because
j, its neighbor, did, and we know that 2 neighbors can never consider themselves
Ripe simultaneously. So after receiving the type0 message from j, i set its Flag to
UP and broadcast a type1 message to all its neighbors. All its FREE neighbors must
have declared it to be the most highly connected node, and therefore we know that
even after the most recent reduction in $|N^+(i)|$ due to a type0 message, node i is
Ripe. However, observe that $|N^+(i)|$ could also have been reduced
because of type1 messages received. That this cannot lead to an error has already
been shown in Case 1. From this we conclude that if i is TAKEN at time T, it
cannot declare itself to be a leader unless it is Ripe.

We  still  have  to  show  that  the  algorithm  never  terminates  before  covering  all  the 
nodes,  and  that  it  never  deadlocks.  The  second  of  these  issues  is  resolved  easily  in  light  of 
the  preceding  discussion.  DISTG  can  never  deadlock  because  there  are  always  Ripe  nodes 
in  the  graph,  until  all  nodes  are  covered.  This  follows  from  the  fact  that  Multigreedy  does 
not  deadlock.  Now  we  show  that  the  algorithm  will  never  terminate  if  there  is  even  a  single 
uncovered node in the network. This can be seen from the stopping conditions at a node:
either the node is a cluster leader, or it has no FREE neighbors. An uncovered node always
has  status  FREE,  and  so  none  of  such  a  node’s  neighbors  can  terminate  until  it  has  been 
covered. 

This completes the proof of correctness for DISTG.

3.5. Equivalence of DISTG and Greedy

We have shown DISTG and Multigreedy pick identical dominating sets; hence it is
sufficient to show that Multigreedy and Greedy behave identically, in order to establish the
equivalence of Greedy and DISTG. For any graph G, with some order of numbering on its
nodes, both Greedy and Multigreedy have unique solutions. Let $R_G$ and $T_G$, respectively,
be these solutions for an undirected graph G(V, E).

Theorem 3.2. $\forall G$, $T_G = R_G$.

Proof: Let $R_G = \{g_1, \ldots, g_{d^*}\}$, where $g_i$ is the node picked in the $i$th iteration.
Define $S_i$ to be the set of uncovered nodes covered by $g_i$ when it is picked. As before let
$|S_i| = m_i$.

Now let $T_k = \{t_j, \ldots, t_l\}$ be the set of nodes picked in the $k$th iteration of Multigreedy.
Observe that 2 nodes cannot be picked in the same iteration of Multigreedy if they have any
FREE nodes in the intersection of their neighborhoods. Define $C_i$ to be the set of uncovered
nodes covered by $t_i$ when it is picked. Clearly the $C_i$'s are mutually disjoint. Also, let
$|C_i| = b_i$. By the definition of the algorithms, we know that [...], which implies
that it is sufficient to show that $\bigcup_{i=1}^{k} T_i \subseteq R_G\ \forall k$. We show the following by induction on
k, the number of iterations:

(i) $\bigcup_{i=1}^{k} T_i \subseteq R_G$

(ii) $\{g_1, \ldots, g_k\} \subseteq \bigcup_{i=1}^{k} T_i$

(iii) $g_i = t_j \Rightarrow S_i = C_j$.

(a) Basis k = 1: Initially, all the nodes are uncovered. Multigreedy selects least numbered
nodes which have the maximum sized neighborhoods in their own second
neighborhoods. Trivially, $g_1$ has this property, i.e. $g_1 \in T_1$. Now suppose that
$i \in T_1$ for some $i \ne g_1$. At some iteration of Greedy at least one of the nodes in
N(i) will be covered for the first time. Let j be such a node. By definition, i is
the least numbered node in the graph which covers the largest number of nodes
including j, so it follows that i will be picked by Greedy. Let $i = g_v$. Now observe
that the set of uncovered nodes covered by i is the same when it is picked by either
algorithm, i.e. $C_i = S_v = N(i)$.

(b) Inductive Step: k = K + 1: By assumption Multigreedy has picked $g_1, \ldots, g_K$, and
possibly some more nodes in $R_G$. Also $g_i = t_j \Rightarrow S_i = C_j$ for all $t_j$'s picked in
previous iterations. Now let's look at $g_{K+1}$. If it has already been picked in previous
iterations of Multigreedy, then condition (ii) of the hypothesis is met. Suppose
$g_{K+1}$ was not picked in previous iterations. By the hypothesis we know that all the
nodes in $S_{K+1}$ must be uncovered in the $(K+1)$th iteration of Multigreedy. Then since
$\{g_1, \ldots, g_K\} \subseteq \bigcup_{i=1}^{K} T_i$ it follows that $g_{K+1}$ is Ripe, and therefore that Multigreedy
picks it in iteration K + 1. Now suppose that $i \ne g_{K+1}$ is Ripe in the $(K+1)$th
iteration. Consider the set $N^+(i)$. None of these nodes is covered in the first K
iterations of Greedy. Thus the first node, j, in the set is covered in Greedy after
iteration K, say iteration w. The node which covers it in Greedy, $g_w$, must be Ripe
at this stage. Since j is the first member of $N^+(i)$ to be covered, it follows that i
must be the highest connected neighbor of j, i.e. $g_w = i$. Hence we have shown that
all three conditions of the hypothesis hold.


Q.E.D.


We have already shown that DISTG returns the same dominating set as Multigreedy.
This, with the above result, establishes that DISTG and Greedy minimize the
number of clusters identically. Based on the work in Chapter 2, it can thus be strongly
argued that DISTG minimizes clusters as efficiently as could be hoped for.

3.6. Complexity of DISTG


Initially, the list Connectivity must be established at each node. This takes 2N
broadcasts. All subsequent communication is triggered by the declaration of a leader.
The number of broadcasts resulting from node i becoming a leader is no more than
$\Gamma_i = |N(N(N(i)))|$. So,

$$\text{total number of broadcasts} \le \sum_{i=1}^{d^*} \Gamma_{d_i} + 2N \qquad (3.1)$$

Now observe that the number of elemental messages in each of the type0, type1 and type2
messages sent is bounded by the size of the neighborhood of some cluster leader, plus 2.
Type3 messages consist of only 2 elemental messages. Let $\delta$ be the size of the largest
neighborhood. Also, let $T_i = |N(N(i))|$. We have:

$$C \le \sum_{i=1}^{d^*} \Big( T_{d_i}\,(2 + |N(d_i)|) + 2\,(\Gamma_{d_i} - T_{d_i}) \Big) + 2N\delta \qquad (3.2)$$

(Recall that $d^*$ is the number of nodes picked by Greedy.) To obtain an accurate expression
for C which depends only on M and N, we will make the assumption that the network is
regular, i.e. all nodes have the same degree. Then $|N(d_i)| \approx 2M/N = \delta,\ \forall i$. It is also
assumed that $\delta \gg 2$, and that communication for establishing Connectivity is negligible, so that
(3.2) is simplified to

$$C \le d^*\big(T_{d_i}\,\delta + (\Gamma_{d_i} - T_{d_i})\big).$$

The  analysis  of  C  is  broken  into  3  cases,  depending  on  the  density  of  the  graph.  This  is 
because (3.1) gives a tight bound for dense graphs, whereas the actual number of messages
exchanged  pertaining  to  a  particular  cluster  head  varies  directly  with  the  density.  We  will 
also  make  use  of  the  bound  for  d*  derived  in  Theorem  2.6. 

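Case 1: $\delta^3 < N$: It is easily seen that $T_{d_i} = 1 + \delta^2$ and $\Gamma_{d_i} - T_{d_i} = \delta(\delta - 1)^2$. Thus

$$C \le d^*\big((1 + \delta^2)\,\delta + \delta(\delta - 1)^2\big)$$

$$\Rightarrow C = O(\delta^3 d^*) \le O(N^2 - N\sqrt{1 + 2M} + N) = O(N^2 - N\sqrt{1 + 2M}) \qquad (3.3)$$

Case 2: $\delta^2 < N \le \delta^3$: As in Case 1,

$$C \le d^*\big((1 + \delta^2)\,\delta + \delta(\delta - 1)^2\big)$$

$$\Rightarrow C = O(\delta^3 d^*) \le O(\delta N d^*) \le O(2MN - 2M\sqrt{1 + 2M} + 2M) = O(MN - M\sqrt{1 + 2M}) \qquad (3.4)$$

Case 3: $\delta^2 \ge N$: Then

$$C \le N \delta d^* \le 2MN - 2M\sqrt{2M + 1} + 2M = O(MN - M\sqrt{2M + 1}) \qquad (3.5)$$

To summarize:

$$C = \begin{cases} O(N^2 - N\sqrt{2M + 1}), & 2^{3/4} M^{3/4} < N \le M + 1 \\ O(MN - M\sqrt{1 + 2M}), & \cdots \le N \le 2^{3/4} M^{3/4} \end{cases} \qquad (3.6)$$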


Observe that the performance degrades with the density of the graph. It is important
to note that it may be possible to modify the algorithm to eliminate all broadcasts of
neighborhood sets of leaders. This would surely complicate the protocol, and it is for
reasons of exposition that we chose not to do this. However, such a modification would reduce
C by a factor of about [...], resulting in better performances (about $O(N)$ and $O(N^2)$) for
the 2 cases in the summary above.

The time complexity is determined quite simply. Every time a leader is declared,
4 types of messages are broadcast, and all messages of a given type are broadcast at the same time. So

$$T \le 4d^* \le 4(N - \sqrt{2M + 1} + 1) \Rightarrow T = O(N - \sqrt{2M + 1}). \qquad (3.7)$$
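As a worked illustration of how these bounds behave, the following sketch evaluates them for given N and M. It is not part of the original analysis: the function name and the returned dictionary are hypothetical, the graph is assumed regular with degree 2M/N as in the derivation, and the bound on the number of cluster heads is taken to be N - sqrt(2M + 1) + 1, the form that appears in the time bound.

```python
import math


def distg_bounds(n, m):
    """Evaluate the DISTG communication and time bounds for n nodes and m edges."""
    delta = 2 * m / n                        # common degree of a regular graph
    d_star = n - math.sqrt(2 * m + 1) + 1    # assumed bound on the number of heads
    if delta ** 2 < n:
        # Cases 1 and 2 use the same expression for the bound on C.
        c_bound = d_star * ((1 + delta ** 2) * delta + delta * (delta - 1) ** 2)
        case = 1 if delta ** 3 < n else 2
    else:
        # Case 3: the neighborhoods are capped by the size of the network.
        c_bound = n * delta * d_star
        case = 3
    return {"case": case, "delta": delta, "d_star": d_star,
            "C": c_bound, "T": 4 * d_star}


sparse = distg_bounds(100, 100)   # degree 2, falls under Case 1
dense = distg_bounds(100, 500)    # degree 10, falls under Case 3
```

The constants hidden by the O-notation are kept here only to make the comparison between sparse and dense inputs concrete; they should not be read as measured message counts.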


3.7.  The  Greedy  Linked  Cluster  Algorithm  (GLCA) 

The aim of this section is to extend DISTG to accommodate effects such as interference,
and the lack of lower level acknowledgement schemes.

There are 2 kinds of collisions: inter-leader and intra-leader. Recall that all communication
in DISTG is triggered by a node declaring itself a leader, and consists of 4 kinds of
messages. By intra-leader collisions we refer to those collisions which occur because of the
communication triggered by the same leader. Inter-leader collisions are defined analogously.
Since a node has rather restricted knowledge of what is going on in the rest of the network,
collisions are inevitable within the framework of DISTG.

Type1 intra-leader collisions can be eliminated quite simply as follows: include a
schedule of how the neighbors of a newly declared leader should broadcast within the
type0 message. This is done without any wastage of the access channel because the leader
knows the connectivities of its neighbors, and can thus tell when 2 or more of these neighbors
can broadcast simultaneously. When a type1 or type2 message is broadcast by a node, it
can similarly schedule the broadcasts of its neighbors, thus minimizing collisions to some
extent. However, in these cases the method may not be efficient.

Inter-leader collisions appear to be very hard to predict, and there may be no way
to minimize them. Notice however, that type0 inter-leader collisions never occur since 2
nodes may declare themselves to be leader only if they do not share any FREE neighbors.

The next step is to describe a broadcast protocol which resolves the collisions which
do occur. Consider the following simple scheme. Time is slotted such that the length of
a slot is equal to the time taken to broadcast a packet. There is a global clock so that
the slots are perfectly synchronized. Packets are broadcast at the beginning of a slot. The
transmission probability, $p_i$, of a node, i, is the likelihood of it transmitting a packet in a
given slot given a message to be transmitted. We set $p_i = (\max_{j \in N(i)} |N(j)|)^{-1}$. This
access scheme has a throughput of at least [...]. To see this suppose that node i has $\Delta$
neighbors. It is clear that $p_j \le 1/\Delta\ \forall j \in N(i)$. The conditional probability of a packet
from i getting through in some slot, given that i does transmit a packet in that slot, is thus
$\ge (1 - 1/\Delta)^{\Delta - 1}$, which converges from above to $1/e$. Retransmission probabilities for collided

packets are adjusted to maintain the values of $p_i$. Unfortunately, expected delays might
be quite high for this scheme, but it seems quite difficult to develop multiaccess strategies
for this situation which assure both high throughput and low delay. Note that since the
packet lengths are very small, a TDMA scheme over all nodes of the network might be
reasonable in terms of throughput and delay. However, every node would then have to
know the identities of all the other nodes in the network; also, additions and deletions of
nodes would become difficult.
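The limiting behaviour of the success probability can be checked numerically. The sketch below assumes the bound has the form (1 - 1/Δ)^(Δ - 1), which is one reading consistent with the statement that it converges from above to 1/e; the exponent is not legible in the source, so this form and the function name are assumptions rather than the thesis's formula.

```python
import math


def success_lower_bound(delta):
    """Chance that delta - 1 contenders, each sending with probability 1/delta, stay silent."""
    return (1 - 1 / delta) ** (delta - 1)


# Evaluate the assumed bound for neighborhood sizes 2 through 199.
values = [success_lower_bound(delta) for delta in range(2, 200)]
limit = 1 / math.e
```

Under this assumption the values start at 0.5 for Δ = 2 and fall steadily toward 1/e without ever dropping below it.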

We  now  deal  with  the  problem  of  acknowledgements.  It  is  clear  that  the  existence 
of  some  acknowledgement  scheme  is  essential  to  the  working  of  any  protocol.  Observe  that 
type1 messages acknowledge type0 messages, type2 messages acknowledge type1 messages,
and type3 messages acknowledge type2 messages. However, type3 messages are never acknowledged, and other
messages need not be acknowledged by all the sender's neighbors. Another unacknowledged
message is the initial broadcast of k*. It is clear that while the algorithm does provide some
form of acknowledgement for many of the messages, it does not deal with the problem
adequately. The precise nature of a more complete scheme depends on the kind of application,
in particular on the nature of the broadcast channel. For a good explanation of the issues
involved see (18).

3.8. The Tree Linked Cluster Algorithm (TLCA)

While DISTG minimizes the number of clusters very efficiently, it does this at a high
cost of communication, and is only applicable to sparse graphs. If the PRN is extremely
dynamic and the control algorithm must be run very frequently, then DISTG is not a good
choice. When a low value of C is crucial, our standards of optimality should be lowered.
In presenting TLCA, we will first use the DISTG framework (from section 3.1), and then
suggest ways to deal with collisions later.

The idea behind TLCA is as follows: Initially every node knows its second neighborhood,
calculates its value of k* (the identity of the least numbered neighbor with maximum
degree), broadcasts this value, and waits to hear the corresponding values from all its
neighbors. So far the algorithm is identical to GLCA. However, the values of k* are not allowed
to change for the rest of the algorithm. Now observe that some nodes will not be preferred
by any of their neighbors. One such node is the highest numbered node of smallest degree.
We call such nodes leaves. A leaf forces its preferred neighbor to become a cluster head
and then terminates. This new head then broadcasts its status, and all its non-terminated
members broadcast the fact that they are now covered. When a non-head receives a message
that one of its neighbors is covered, it disregards that neighbor's value of k* and then
checks to see if it is a leaf. Thus all communication is triggered by leaf nodes. The code of
the algorithm is presented below:

Data Structures at node i:

Connectivity: a list which contains the adjacencies of the first and second neighborhoods of i.

Status: a variable which takes values from {TAKEN, FREE, HEAD}.

k*_i: a variable with value equal to the identity of the least numbered neighbor of i with the
maximum degree in N(i).

Lstatus: a list of neighbors j : k*_j = i, i.e. those neighbors which prefer i.

Leader: the identity of the cluster leader of node i.

Initialization at node i:

    Status=FREE;
    Leader=i;
    Send(k*_i);
    Rec(k*_j) from all neighbors;
    Update Lstatus;

Event CheckLeaf: executed after receiving k*_j from all neighbors.
begin
    If Lstatus=∅ then
    begin
        Send(k*_i become a HEAD);
        Leader=k*_i;
        Status=TAKEN;
        TERMINATE;
    end
end.

Event BeHead: executed when Rec(i become a HEAD);
begin
    Leader=i;
    Status=HEAD;
    Send(i HEAD);
    TERMINATE;
end.

Event TellCovered: executed when Rec(j HEAD);
begin
    If Status=FREE then
    begin
        Status=TAKEN;
        Leader=j;
        Send(i covered);
    end
end.

Event Revise: executed when Rec(j covered);
begin
    Lstatus=Lstatus-{j};
    CheckLeaf;
end.


A few comments are in order. Whenever a node is covered by a head it sets
Check=FALSE, because it need never force another node to become a HEAD. However, if
a FREE node receives a message that one of its neighbors has been covered, it must then
check to see if it is now a leaf. The algorithm terminates at a node when it becomes either
a HEAD or a leaf. If a node is neither, it follows that Lstatus ≠ ∅ and there is some
uncovered neighbor which prefers it. Therefore the algorithm should not terminate in this
case. When all nodes are covered it is clear that the algorithm will stop at all nodes.
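The events above can be exercised with a small centralized simulation. The following Python sketch is an illustration under stated assumptions, not the thesis's implementation: messages are processed one at a time in a single sequential order (a real network would interleave them differently), k* is taken over the closed neighborhood so that a node may prefer itself, only FREE nodes act as leaves, and the function and variable names are hypothetical.

```python
from collections import deque

FREE, TAKEN, HEAD = "FREE", "TAKEN", "HEAD"


def tlca(adj):
    """Simulate TLCA under one sequential schedule; return (heads, leader)."""
    degree = {v: len(adj[v]) for v in adj}
    # Assumption: k* is the least numbered node of maximum degree in N(v) plus v itself.
    kstar = {v: min(adj[v] | {v}, key=lambda u: (-degree[u], u)) for v in adj}
    # Lstatus: the neighbors of v that prefer v.
    lstatus = {v: {u for u in adj[v] if kstar[u] == v} for v in adj}
    status = {v: FREE for v in adj}
    leader = {v: v for v in adj}

    pending = deque(sorted(adj))                  # nodes that should run CheckLeaf
    while pending:
        v = pending.popleft()
        if status[v] != FREE or lstatus[v]:
            continue                              # covered already, or still preferred
        head = kstar[v]
        status[head], leader[head] = HEAD, head   # BeHead at the preferred node
        if head != v:
            status[v], leader[v] = TAKEN, head    # the leaf joins and terminates
        for u in sorted(adj[head]):               # TellCovered at each FREE member
            if status[u] == FREE:
                status[u], leader[u] = TAKEN, head
                for w in adj[u]:                  # Revise at every neighbor of u
                    lstatus[w].discard(u)
                    pending.append(w)
    heads = {v for v in adj if status[v] == HEAD}
    return heads, leader


# Hypothetical example: the path 1-2-3-4-5.
path5 = {1: {2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 5}, 5: {4}}
heads, leader = tlca(path5)
```

On this path, nodes 1 and 5 are the initial leaves; they force nodes 2 and 4 to become heads, and node 3 is covered by node 2 before it ever has to act.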

The  complexity  of  TLCA  is  now  analyzed.  A  node  does  not  broadcast  more  than  4 
times.  Two  broadcasts  go  in  establishing  the  list  Connectivity.  Subsequently  a  node  may 
be  a  leaf,  in  which  case  it  broadcasts  once;  or  a  HEAD  in  which  case  it  broadcasts  once 
to  declare  its  status,  and  it  may  have  had  to  broadcast  once  earlier  to  indicate  that  it  was 
covered by a neighbor. Thus the number of messages broadcast is:

$$C \le 4M \Rightarrow C = O(M)$$


The time complexity depends on the number of clusters. This number is trivially
bounded by N. For every cluster leader declared, one broadcast comes from the leaf, one
from the head, and one from each of the non-terminated neighbors of the head. Thus


$$T \le 3N \Rightarrow T = O(N).$$


TLCA picks {1,8,9,10,11,12,13}, whereas Greedy picks {1,13}.


Note that the algorithm will never choose N clusters, and it represents a significant
improvement over ITF (recall Fig. 1.4).

Collisions may be minimized by each head sending a schedule of when its neighbors
should transmit that they are covered. Collisions are resolved by the same scheme as they
were in GLCA. Acknowledgement schemes will also be similar for both algorithms.


3.9.  Summary 

Two Linked Cluster algorithms have been presented and analyzed. GLCA minimizes
the number of clusters identically to Greedy. Thus this algorithm produces cluster
organizations which will, in general, lend themselves to standard solutions of the hidden
terminal problem, such as Busy Tone. However, it does not fare as well in the amount of
communication required. Since this will adversely affect the number of collisions in the network,
GLCA is probably best suited to environments in which the topologies are not highly
dynamic. TLCA is well suited to more mobile situations since it has extremely low values
of C and T, but clusters may not be minimized very well. Limitations of our work, and
specific suggestions for further work are mentioned in the next chapter.


CHAPTER IV

CONCLUSION AND SUGGESTIONS FOR FURTHER WORK


4.1. Conclusion


Considerable insight has been gained into the difficulties associated with minimizing
the number of clusters in stationless PRNs, and plausible approaches to solve these problems
have been suggested. We showed that 3 different formulations of the minimization
problem are NP-complete. It was then shown that it is extremely unlikely that an efficient
heuristic exists with constant bounded differential or fractional error. A simple greedy
heuristic, Greedy, was analyzed extensively in terms of its worst-case fractional error, and it
was shown that other, more complicated algorithms do not behave significantly better in
the worst case. It was also proved that a famous bound by Vizing for the cardinality of
the minimum dominating set is also met by the dominating set selected by Greedy.

In  the  third  chapter,  we  presented  2  distributed  algorithms,  one  which  produces  the 
same  dominating  set  as  Greedy,  but  which  entails  considerable  overhead  in  communication 
and  is  inefficient  for  dense  graphs,  and  another  in  which  the  amount  of  communication  is 
cut down significantly, but at the cost of not minimizing the number of clusters as well as
Greedy.  We  do  not  claim  that  either  of  these  algorithms  could  actually  be  implemented 
without  modifications,  some  of  which  are  discussed  in  the  next  section,  but  feel  that  some 
headway  has  been  made  on  a  difficult  problem. 


4.2.  Suggestions  for  Further  Work 


In  this  section  some  of  the  limitations  of  the  work  in  this  thesis  will  be  presented, 
along  with  suggestions  for  further  research. 

There  are  a  number  of  interesting  issues  having  to  do  with  the  centralized  problem. 
First,  we  have  given  no  results  for  worst  case  performance  of  heuristics  for  CDSP  and 
SCDSP.  Second,  an  average  case  analysis  of  algorithms  such  as  Greedy,  which  have  the 
worst-case  behaviour  described  in  theorem  2.5,  would  help  better  understand  how  they 
actually do in practice. Third, all results mentioned in this thesis apply to the general
case,  in  which  the  graph  topology  can  be  anything  so  long  as  it  is  connected.  However, 
under  certain  classes  of  applications,  it  might  be  reasonable  to  assume  that  for  example,  all 
transmitting  radii  are  equal,  or  that  every  radio  is  connected  bidirectionally,  to  the  same 
number  of  radios.  Another  possible  restriction  is  to  assume  that  all  radios  must  be  within 
a  certain  constrained  area  (say  a  square  of  side  K  miles.) 

We  have  some  very  preliminary  results  for  the  case  of  special  graph  topologies,  and 
mention them here solely for the benefit of the interested reader who wishes to gain some
insight  into  these  problems. 

Lemma 4.1. G is a representation of a PRN in which all transmitting radii are equal only
if the following structures are not subgraphs of G: (Figure 4.1.)

Proof: Fig. 4.1(a) can never be a subgraph because at most 5 non-intersecting
circles of radius r can be drawn so that all their centers are contained in another circle of
radius R.

Now suppose that 2 radios, a and b, are both connected to 2 other nodes, c and d. It is easy
to see that the distance between c and d is < 2r. This means that a third radio, e, which is
shared by a and b must be connected to either c or to d.


Define $i_{min}$ to be the cardinality of the minimum cardinality maximal independent
set of G.


Lemma 4.2. $i_{min} = K_0$ for a graph G, if the following structure is not a subgraph:


Fig. 4.2


Proof: Let D be a minimum cardinality dominating set, and suppose that it is not
an independent set. Then let a and b be two adjacent nodes in D. Observe now that for
every node in D there must be at least one node (including itself) that it covers exclusively,
i.e. no other member of D is adjacent to it. If this were not true then at least one node
could be removed from D without losing the dominating nature of the set. So let $O_a$ be
the set of nodes covered exclusively by node a. If $O_a = \{e\}$ then the set $D \cup \{e\} - \{a\}$ is
a dominating set, and e is not adjacent to any other node in D. So let $|O_a| > 1$. Then
let e and f be 2 nodes in $O_a$. Now observe that a is adjacent to b as well, and we know
that it cannot be adjacent to 3 mutually non-adjacent nodes. It follows that e and f are
neighbors. In fact since these nodes were arbitrarily picked from $O_a$, the members of this
set must form a clique. Then it is easy to see that $D \cup \{e\} - \{a\}$ must also be a dominating
set, and that none of the nodes in this set are adjacent to e (except itself). In this way we
can keep replacing nodes of the dominating set until they are all mutually nonadjacent, i.e.
the set is independent. From this we conclude that $i_{min}$ is no bigger than $K_0$. Now observe
that every maximal independent set is a dominating set. Thus $i_{min} \ge K_0$. The result
follows.
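The last step of the proof relies on the fact that every maximal independent set is a dominating set. A small Python sketch can make this concrete; the helper names and the adjacency-dictionary representation are illustrative assumptions, and the construction shown yields some maximal independent set, not necessarily one of minimum cardinality.

```python
def maximal_independent_set(adj):
    """Scan nodes in increasing order, keeping each node that has no chosen neighbor."""
    chosen = set()
    for v in sorted(adj):
        if not (adj[v] & chosen):
            chosen.add(v)
    return chosen


def is_independent(adj, nodes):
    """True when no two members of nodes are adjacent."""
    return all(not (adj[v] & nodes) for v in nodes)


def is_dominating(adj, nodes):
    """True when every node is in nodes or has a neighbor in nodes."""
    return all(v in nodes or bool(adj[v] & nodes) for v in adj)


# Hypothetical example: a cycle on five nodes.
cycle5 = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 0}}
mis = maximal_independent_set(cycle5)
```

Any node left out of the set was skipped because it already had a chosen neighbor, which is exactly why the resulting set dominates the graph.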

We now turn to the work in Chapter 3. There are two major limitations of the
algorithms presented. First, there are no global terminating conditions, i.e. a node has
no way of knowing that the algorithm has terminated at all other nodes. Secondly, nodes
are merely assigned to leaders; clusters are not linked by gateway nodes. Thus our work on
distributed algorithms can be improved by incorporating these features.

It is clear from this thesis that bad linked-cluster organizations can result in disastrous
consequences. An interesting area of research is to examine the tradeoff between minimizing
the number of clusters and these consequences. We also argued that gateway nodes should
be minimized. It would be interesting to explore the tradeoffs among the three problem
formulations, DSP, CDSP, and SCDSP, in terms of the most efficient linked-cluster
organization. A better understanding of what multiaccess method is to be used once the clusters
are set up, and of what the best routing strategy is, are prerequisites to attempting such an
analysis, and we suggest work in these areas as well.